1.0 INTRODUCTION


The  human  thought  processes  involved  in  the  interpretation  of  an 
image  field  I(x,y),  which  may  have  been  generated  by  any  one  of  several 
types  of  imaging  devices  (for  example,  an  aerial  photography  camera,  an 
imaging radar, an infrared scanner), are infinite in variety. The interpretation
rules which each of us has stored mentally differ according
to  our  past  experiences  and  to  our  own  distinct  methods  of  dissecting 
and  analyzing  visual  material.  There  is  some  consistency,  however, 
among  human  judgments  of  image  quality  and  interpretability  (especially 
for  photographs). 

The  goal  of  a  quality  measure  is  the  evaluation  of  the  behavior 
of  an  imaging  system  in  as  concise  terms  and  in  as  few  parameters  as  is 
possible.  Quality  metrics  may  be  used  by  system  designers  or  by  the 
image user; the former is knowledgeable about the individual components
of the sensor but may not think of the system output in terms of its
images, while the latter knows his application's needs and therefore knows, in
image terms, what he wants to get out of the system. The need for quality
metrics  which  one  can  quickly  reference  increases  greatly  with  increased 
complexity  of  the  system  (e.g.,  more  lenses,  mirrors,  signal  processing 
filters,  and  so  forth). 

The  notion  of  an  image  quality  measure  is  especially  appealing  for 
radar system designers because the thought processes involved
in designing an imaging radar (real or synthetic aperture) can
tend to exclude in-depth image application considerations. Worries about
power consumption, source stability, antenna cross-polarization levels,
etc., can rapidly overwhelm the design engineer, leaving the "goodness"
or utility of the ultimate product, the images, to chance. Thus the
primarily mathematical nature of his task and the size of his task may
preclude learning very much about radar imagery interpretation.

Radar image quality metrics which relate to his choice of design parameters
may help fill a gap in the range of experience. Quality measures
can also help image users, who should be instrumental in mission planning.

Though there is a substantial amount of literature concerning photographic
image quality measures (e.g., Pratt, 1978; Linfoot, 1960), only
a  few  are  applicable  to  the  radar  situation.  Quite  different  kinds  of 
information  are  conveyed  by  the  photographs  and  the  radar  images  as 
they  employ  disparate  spectral  bands.  We  specifically  consider  active 
microwave  sensor  images  in  this  paper,  for  they  are  so  distinct  by  nature 
of  the  process  by  which  they  are  formed  that  quality  measures  derived 
for  incoherent  optical  systems  do  not  suffice.  The  basic  discrepancies 
between the two image types arise from the varying degrees of illumination
coherence and from geometrical and other spectral considerations.

As  an  example  of  the  causes  of  geometrical  differences,  the  radar  may 
consist  of  a  monostatic  arrangement  while  the  illumination  source  and  the 
sensor  for  the  photograph  are  usually  spatially  separated.  Additionally, 
the  coherent  imaging  systems  are  often  modeled  as  multiplicative  noise 
processes,  while  incoherent  image  systems  are  often  modeled  with  the 
noise being additive in effect. The speckle common to both coherent
optical and radar systems dictates that the nature of quality measures
and, for example, of image processing algorithms should be different for
radar images and ordinary incoherent-optics photographs.
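
The practical difference between the two noise models can be illustrated with a short simulation. The sketch below is only an illustration under stated assumptions, not a procedure from this study: single-look speckle is modeled as unit-mean exponential noise multiplying the scene intensity, and incoherent-system noise as zero-mean Gaussian noise added to it. The function names, the noise level `sigma`, and the two test intensities are hypothetical.

```python
import random
import statistics

def speckled(intensity, n, rng):
    """Multiplicative model: intensity times unit-mean exponential noise."""
    return [intensity * rng.expovariate(1.0) for _ in range(n)]

def additive(intensity, n, rng, sigma=5.0):
    """Additive model: intensity plus zero-mean Gaussian noise (assumed sigma)."""
    return [intensity + rng.gauss(0.0, sigma) for _ in range(n)]

rng = random.Random(0)
n = 20000
for level in (10.0, 100.0):
    s_mult = statistics.pstdev(speckled(level, n, rng))
    s_add = statistics.pstdev(additive(level, n, rng))
    # Multiplicative noise grows with the scene intensity; additive noise does not.
    print(f"intensity {level:6.1f}: multiplicative std {s_mult:7.2f}, "
          f"additive std {s_add:5.2f}")
```

Under the multiplicative model the fluctuation scales with the local mean, so a fixed threshold or gradient operator tuned for additive noise behaves differently in bright and dark regions of a radar image.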

The  development  of  relationships  between  the  radar  image  metrics 
(e.g.,  dynamic  range,  signal-to-noise  ratio,  mean  square  bandwidth,  etc.) 
and  the  image  utility  for  several  applications  is  a  monumental  task. 
The experimental design which was adopted by the authors called for a
number  of  trained  radar/photo  interpreters,  numerous  image  metrics  and 
image  applications  to  achieve  statistically  significant  results.  In  order 
to  handle  the  volume  of  data  collected  and  yet  to  maintain  visibility  of 
any  group  of  variables  of  interest,  response  surface  procedures  were 
applied  to  the  data  analysis  (Myers,  1971).  This  methodology  allows  one, 
for  instance,  to  arrange  in  order  of  importance  the  image  metrics  given 
an  application  (assuming  that  the  necessary  supporting  data  were  collected). 
Prior to presenting these results, we first discuss the general
principles of radar image formation and then review
previous radar image quality studies.

It  is  hoped  that  the  new  results  documented  here  will  be  not  only 
interesting  but  also  useful  for  those  scientists  and  engineers  whose  tasks 
relate  to  imaging  radar.  The  images  of  such  devices  are  truly  fascinating 
as  they  present  a  different  way  of  "looking"  at  terrain,  whether  it  is  a 
portion of the surface of the Earth or of the surface of Venus. Understanding
the information in the radar image and knowing how to design the
active microwave system to gather relevant information are significant
achievements; these were our goals at the outset of this study.

1.1  Background 

The  organization  of  the  following  sections  is  intended  to  stress 
image  evaluation  in  the  SAR  context.  Previous  quality-interpretability 
work  for  SAR  is  discussed,  along  with  the  coherent  speckle  literature, 
and then work on the processing of images containing signal-dependent
noise is covered. Analogous quality/interpretability work in the optical
systems literature is reviewed. Test target considerations for image
evaluation are briefly described. But first it is worthwhile to talk
about  the  concepts  of  image  interpretability  and  quality  for  radar 
imagery. 

1.1.1  The  Notions  of  Quality  and  Interpretability  for  SAR 

When  one  speaks  of  quality  and  interpretability  investigations, 
it is obvious that even the definitions of these two terms are uncertain.
Here several thoughts on the distinctions between the two, and
also on their relations to other, perhaps measurable, physical
features or quantities, are discussed. Pratt (1978) and Haralick
(1978)  have  also  considered  the  terminology  difficulties  implied 
here,  in  the  context  of  digital  image  processing. 

Foremost, we equate SAR image quality and interpretability to image
fidelity and intelligibility, respectively, following Pratt's (1978)
notation. The thoughts expressed below represent our opinions on the
term quality (fidelity):

(1) Image quality cannot be consistently, directly, equated
to interpretability (intelligibility).

(2)  A  high  quality  SAR  system  produces  an  output  (an  image) 
which  selectively  mimics  the  input  (just  prior  to  the 
antenna)  in  its  spatial  or  spatial  frequency  structure; 
processing  such  as  azimuth  correlation,  range  compression, 
and  speckle  reduction  are  necessary  in  the  production  of  an 
intelligible  (to  a  human)  scene. 

(3) Geometric fidelity is important for a quality image or imaging
sensor (e.g., since processing of a slant range radar image
is based partly upon the assumption of range perspective,
it is important for the image to conform to these expectations).

(4) Quality measures do not rely on the previous existence of a
mental matched filter for objects in the scene (Barnard, 1972).

(5)  Quality  does  not  particularly  relate  to  size,  shape,  or 
orientation  of  features  in  the  target  scene. 

(6)  Question  -  Is  the  quality  relatable  (analytically  or  empirically) 
to  the  complexity  of  the  restoration  filter  which  takes  the 
scene  back  to  the  "ideal  image"? 

(7)  Question  -  How  is  quality  (fidelity)  related  to  the  information 
content? 

(8) A fidelity measure can be applications-independent, as
opposed to an interpretability measure.

(9)  Existing  quality  measures  can  generally  be  broken  into  the 
univariate  and  bivariate  types;  the  former  consists  of 
measurements  made  on  a  single  image  field  while  the  latter 
involves  numerical  comparison  between  a  pair  of  images 
(e.g.,  between  the  test  and  the  "ideal"  image). 
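
The univariate/bivariate distinction in item (9) can be made concrete with a small sketch. The two metrics below are chosen purely for illustration and are not claimed to be the measures used in this study: the contrast ratio (standard deviation over mean) needs only one image, while the normalized mean-square error needs both a test image and an "ideal" reference. The tiny images and function names are hypothetical.

```python
import statistics

def contrast_ratio(image):
    """Univariate measure: standard deviation over mean of a single image."""
    pixels = [p for row in image for p in row]
    return statistics.pstdev(pixels) / statistics.fmean(pixels)

def normalized_mse(test, ideal):
    """Bivariate measure: squared error between two images,
    normalized by the energy of the ideal image."""
    num = sum((t - i) ** 2
              for t_row, i_row in zip(test, ideal)
              for t, i in zip(t_row, i_row))
    den = sum(i ** 2 for i_row in ideal for i in i_row)
    return num / den

# Hypothetical 2 x 4 "ideal" scene and a slightly perturbed test image
ideal = [[0, 0, 10, 10], [0, 0, 10, 10]]
test = [[1, 0, 9, 10], [0, 1, 10, 9]]

print(contrast_ratio(test))          # computed from the test image alone
print(normalized_mse(test, ideal))   # requires the reference image
```

A bivariate measure presupposes that an ideal image exists, which for real radar scenes is itself a modeling assumption.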

Image interpretability or intelligibility can be understood to be
distinct from quality (fidelity) in light of the following facts (and
opinions):

(1)  Interpretability  is  related  to  the  human  observer's  previous 
experience  (with  SAR  imagery);  the  existence  of  a  mental 
matched-filter  is  important  (Barnard,  1972). 

(2) Geometric fidelity is in general less critical for intelligibility
than for fidelity measures (for modest
geometric warpings).

(3) Interpretability can be aided in some instances by stereoscopic
or other special viewing capabilities.

(4) Interpretability can be improved by appropriately designed
enhancement or restoration filters.

(5)  Intelligibility  can  be  improved  by  scale  changes  and  scene 
rotations  (Barnard,  1972). 

(6)  Object  recognition,  related  to  interpretability,  involves  a 
time  element  (unlike  quality). 

(7) Interpretability is related to the image coding scheme; for
instance, negating the gray scale coding can impede recognition,
compressing the dynamic range in the image can lead to
poorer interpretability, etc.

(8) Question - How is the level of interpretability related to
the complexity of a restoration filter?

(9) Question - How are information content and image interpretability
linked? A low information content scene can be
highly interpretable and yet a great deal of information
content does not guarantee a large "interpretability" factor.

(10) Definitely, interpretability is applications-dependent.

These remarks have been introduced to preface the following review
of literature on radar and optical quality and interpretability research.
One notices that the interchangeable use of these terms is widespread
and little care is generally taken to define the concepts
before pursuit of the various measures.

1.1.2  Previous  Quality  and/or  Interpretability  Work  for 
Radar  Imagery 

The unclassified literature has been reviewed and a number of publications
regarded to be significant will be discussed. In all cases but
two, radar imagery examples did not accompany the published versions of
the quality and/or interpretability studies. Moore (1979) presented
various  examples  of  radar  imagery,  and  R.L.  Mitchell  (1974)  employed 
radar  speckle  simulations  for  his  experiments. 

W.A.  Penn  (1962)  authored  one  of  the  earliest  unclassified  papers 
on  signal  fidelity  for  radar  imagery.  His  discussions  center  upon 
"background"  interference  between  output  signal  cells,  multiplicative 
(fading)  noise,  and  processing  and  display  nonlinearities.  Incoherent 
(post-detection)  integration  is  introduced  to  lessen  the  fading  variations. 
Through a heuristic argument, Penn suggests a measure of the radar map
"quality" to be the product P·Q, where P is the number of samples
averaged and Q is the average "signal-to-correlation noise." In Penn's
notation, the correlation noise is larger than the amplitude of the noise
by a factor of the time-bandwidth product. Simulations employing aerial
photographs were used for demonstration and for empirical derivation of
the P·Q conclusion.

D.G.  Corbett  et  al.  (1964)  developed  a  recognition  metric  for  radar 
imagery  target  features.  It  was  assumed  that  the  inverse  of  the  time 
taken  (by  equally  trained  observers)  to  reach  a  correct  decision  could 
be functionally related to size of the target and a number of transmissivity
measures of the target and its background. A disappointingly
low  correlation  coefficient  was  observed  between  the  developed  prediction 
equation  and  the  target-recognition  times  when  applied  to  new  radar  imagery 
(similar  to  that  employed  in  the  study). 

R.O. Harger (1973) performed a theoretical study of SAR imagery that,
although different from all the other interpretability studies mentioned
herein, is included for completeness. While most SAR systems are designed
around impulse response criteria, Harger suggests a design to minimize
the probability of classification error (it is assumed that the field
to be classified is a known region whose boundaries are predetermined).
Given the SAR system, noise and reflectivity density spectral density
models assumed, the decision problem set up by Harger is relevant for a
"Gaussian signal in Gaussian noise." The solution for the optimum classification
rule involves a nonlinear filter which includes a matched
filter (i.e., the solution for impulse response design criteria).

R.L.  Mitchell  (1974)  presents  an  interesting  simulation  experiment 
in which he begins with a cross on a dark background to study sensitivity
of the SAR image with respect to resolution and averaging. The cross
consists  of  many  pixels  whose  individual  grey  shades  are  taken  from  sample 
functions  of  random  distributions,  e.g.,  Rayleigh,  and  log-normal  for 
several  different  variances.  Mitchell  then  varies  the  incoherent  averaging, 
background  noise,  and  film  response.  His  conclusion  is  that  image  "quality" 
(which  he  is  equating  with  the  recognition  of  the  cross  shape)  varies 
most  directly  with  the  image  resolution.  His  results,  though  interesting 
and  useful,  would  have  been  more  realistic  if  the  backgrounds  had  also 
been  radar  speckle  patterns  (from  similar  distributions,  different  means, 
etc.)  rather  than  constant  tone. 

R.H.  Mitchel  (1974)  developed  a  SAR  image  quality  analysis  model 
in his Ph.D. dissertation (University of Michigan). His thinking is
strongly influenced by the thorough review he presents on optical image
quality measures. The "Radar Threshold Quality Factor" (RTQF) developed by
Mitchel incorporates measures of the impulse response main lobe width,
the signal-to-noise ratio, noncoherent integration, and human visual
system factors. The RTQF assumes a Gaussian-shaped impulse
response (effects of impulse response sidelobes are not included). Additionally,
the clutter model employed is additive rather than multiplicative
for the sake of mathematical tractability. Validation of the trends
predicted by the RTQF was accomplished by experimentation with a radar
holographic  viewer  built  by  the  author. 

G.R. DiCaprio and J. Wasielewski (1976) performed an interpreter
study using conventionally processed and incoherently degraded SAR
imagery. Of interest was the target detection accuracy for the test scenes
and  the  times  required  to  perform  the  identifications.  The  conclusions 
of  the  authors  were  that  the  incoherent  degradation  allowed  the  radar 
interpreters  to  detect  targets  significantly  faster  and,  secondly,  that 
if  more  supporting  ground  truth  data  had  been  available  the  noncoherent 
degradation  would  have  improved  detection  accuracy  also. 

D.W.  Craig  and  M.L.  Hershberger  (1977)  also  reported  results  of 
an  interpreter  study.  Employing  trained  observers,  they  investigated 
the  effects  of  radar  sensor,  display,  and  mission  variables  on  tactical 
target  acquisition  performance.  Serious  criticism  of  their  results  arises 
because  target  types,  area  coverage  and  display  quality  were  dissimilar 
for the high/low resolution imagery cases. However, simply restating
their conclusions: for their tasks and experimental set-up, higher resolution
imagery outperformed lower resolution imagery in all tests.

R.K. Moore (1979) performed a sensitivity study to determine the effects
on  radar  image  interpretation  of  spatial  resolution,  using  non-square 
pixels,  and  noncoherent  averaging.  This  research  will  be  discussed  later 
in  this  document. 

1.1.3  Radar  Speckle  and  Multiplicative  Noise  Literature 

In attempts to deal analytically or empirically with SAR image
quality, interpretability, classification, edge detection, and so on,
many investigators have performed research on the effects of the coherent
nature of SAR imagery. The importance of speckle modeling has become
apparent as techniques of wide band optical image processing (which do
not have to deal with small signal-to-noise ratios) have failed to be
applicable to SAR imagery (especially edge detection algorithms).

It  seems  particularly  relevant  to  review  the  literature  in  this  field 
in  light  of  the  fact  that  SAR  impulse  response  quality  criteria  are 
deterministic  in  nature  and  yet  random  SAR  speckle  is  recognized  as  a 
primary degradation factor in many interpretability studies.

J.W. Goodman (1976) presents some of the fundamental characteristics
of speckle and derives the exponential probability density function
for certain speckle situations. He also demonstrates the important fact
that addition of M uncorrelated speckle patterns on an intensity basis
improves viewing and reduces the image contrast (σ/μ) by a factor of
1/√M. Time, space,
frequency, or polarization diversity is utilized to obtain the independent
speckle patterns, as is well known in SAR theory and operation. A good
bibliography of historically interesting and contemporary articles concludes
this paper.
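
The effect of intensity averaging on speckle contrast is easy to check numerically. The sketch below assumes fully developed single-look speckle with an exponential intensity distribution and statistically independent looks; it averages M such samples per pixel and compares the measured contrast (standard deviation over mean) with 1/√M. The function name, pixel count, and seed are hypothetical choices made for this illustration.

```python
import random
import statistics

def multilook_contrast(m, n_pixels, rng):
    """Contrast (std/mean) after averaging m independent
    exponentially distributed speckle intensities per pixel."""
    pixels = [
        sum(rng.expovariate(1.0) for _ in range(m)) / m
        for _ in range(n_pixels)
    ]
    return statistics.pstdev(pixels) / statistics.fmean(pixels)

rng = random.Random(0)
for m in (1, 4, 16):
    measured = multilook_contrast(m, 20000, rng)
    print(f"M={m:2d}: measured contrast {measured:.3f}, "
          f"predicted 1/sqrt(M) = {m ** -0.5:.3f}")
```

If the looks are partially correlated, as with overlapping subapertures, the reduction would be smaller than this independent-sample prediction.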

A.  Kozma  and  C.R.  Christensen  (1976)  illustrate  an  experiment  in 
which  both  a  grating  and  a  continuous  tone  image  were  illuminated 
coherently  and  incoherently.  Subjective  analyses  showed  that  the  speckle 
masks  spatial  information  present  in  the  image  and  has  the  effect  of 
increasing  the  minimum  resolution  patch  that  can  be  obtained  with  a  given 
aperture area. For imaging the grating (bar targets) it was determined
subjectively that the aperture of the coherent system needed to be
about twice as large as that for an equivalent incoherent illumination
system to obtain equal resolution. For the continuous tone target employed,
the aperture ratio increased to approximately five. This difference
between the two target types was hypothesized to be related to differences
in complexity of the decision process. Noncoherent addition of independent
speckle patterns to achieve a signal-to-noise ratio of about ten allowed
the coherent and incoherent system performances to be approximately equal
as concerned apparent resolution capability. A similar result was found
for the continuous tone target (S/N = 10). This agrees well with the
findings of Moore (1979).

J.S.  Zelenka  (1976)  discusses  the  mathematics  supporting  a  discrete 
and a scanning system for SAR speckle reduction. As in many other discussions,
the averaging is accomplished by processing subapertures
coherently and then adding the resulting images on an intensity basis.
The two methods described differ in that the discrete method forms nonoverlapping
independent apertures while the samples are not in this sense
independent for the scanning case, though effectively there are more
samples. The conclusion of Zelenka is that the scanning processor is
more effective for speckle reduction in terms of signal-to-noise improvement
vs. loss in resolution given a certain resolution of the final
output image.

Porcello et al. (1976) also treat the topic of image speckle reduction
for SAR. Frequency and angular diversity are suggested in the mixed
integration  processor  context.  A  series  of  noncoherently  averaged  radar 
images  is  presented  to  demonstrate  improved  viewing  capabilities.  The 
same  basic  information  presented  by  Zelenka  (1976)  is  given  in  this 
paper. 

In  addition  to  the  studies  of  the  statistical  nature  of  coherent 
speckle,  several  articles  have  been  published  on  image  processing  in  the 
context  of  multiplicative  noise.  This  research  is  reviewed  because 
speckle is a form of multiplicative noise. Whereas the authors referenced
thus far in this section have attempted speckle reduction at the
point in radar image formation when amplitude and phase are both controllable,
the works of the next few authors (Frost et al., 1981) examine the case
of speckle reduction when only magnitude (no phase) data is available to
the investigator; that is, image data (not signal film or radar holograms)
was experimented upon.

Walkup  and  Choens  (1974)  and  Kondo  et  al.  (1977)  discuss  image 
processing  and  restoration  by  the  use  of  a  Wiener  filter  when  the  noise 
process  is  modeled  as  signal-dependent  (rather  than  being  treated  as 
additive).  The  signal  and  noise  are  both  considered  to  be  wide  sense 
stationary  random  processes,  with  the  noise  spectral  density  assumed 
known. The Wiener filter derived is non-adaptive (because of the stationarity
assumptions). Experimental results indicate that after Wiener
filtering is done to estimate the signal, edge detection algorithms
applied on the output have a greater probability of representing true
boundaries, rather than false boundaries characteristically produced on
speckled, non-smoothed radar or photographic imagery. The above work
was  done  for  restoration  of  images  corrupted  by  film  grain  noise. 

Naderi  and  Sawchuk  (1978)  report  the  results  of  running  adaptive 
Wiener  filters  on  images  degraded  by  film  grain  noise.  The  adaptive 
nature  of  the  filter  is  necessitated  because  of  the  nonstationarity  of 
the  image  first  order  statistics.  The  results  presented  demonstrate 
the  improvements  brought  about  by  making  the  Wiener  filters  adaptive. 

Oppenheim  et  al.  (1968)  discuss  nonlinear  filtering  of  multiplied 
and convolved signals to develop the "homomorphic" filter, not constrained
by linearity assumptions, to produce an estimate of the desired
signal. The homomorphic, Wiener, inverse, and constrained least squares
estimation filters are discussed in a tutorial digital image processing
article by B.R. Hunt (1975). The interesting result of his restoration
techniques shows the constrained least squares (CLS) filter to produce
the  most  visually  pleasing  image  from  a  degraded  version  of  the  same 
scene.  Though  theoretically  the  Wiener  filter  is  the  optimal  linear 
filter,  Hunt  hypothesizes  that  the  human  visual  system  transfer  function 
"matches"  better  with  the  output  of  the  CLS  and,  thus,  gives  the  human 
observer  the  impression  of  a  superiorly  reconstructed  scene. 
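
The homomorphic idea for multiplied signals can be sketched in a few lines: taking the logarithm turns the product of scene and noise into a sum, a linear filter is applied in the log domain, and the result is exponentiated. The example below is a minimal one-dimensional illustration under stated assumptions, not the filter of any cited paper: the linear stage is a plain moving average, and the two-level scene, the four-look noise model, and all names are hypothetical.

```python
import math
import random

def moving_average(values, half_width):
    """Simple linear smoother; near the edges it uses the samples available."""
    out = []
    for k in range(len(values)):
        window = values[max(0, k - half_width): k + half_width + 1]
        out.append(sum(window) / len(window))
    return out

def homomorphic_smooth(observed, half_width=8):
    """log -> linear filter -> exp, so multiplicative noise is treated additively.
    Input samples must be strictly positive."""
    logs = [math.log(v) for v in observed]
    return [math.exp(v) for v in moving_average(logs, half_width)]

rng = random.Random(0)
scene = [10.0] * 100 + [100.0] * 100            # hypothetical two-level scene
# Multiplicative noise: mean of 4 unit-mean exponential samples (assumed model)
noise = [sum(rng.expovariate(1.0) for _ in range(4)) / 4 for _ in scene]
observed = [s * n for s, n in zip(scene, noise)]
estimate = homomorphic_smooth(observed)

# Averaging in the log domain gives a geometric mean, so the estimate is
# biased low relative to the true intensity, but by the same factor at both levels.
print(sum(estimate[20:80]) / 60, sum(estimate[120:180]) / 60)
```

The design point is that the noise variance in the log domain no longer depends on the scene level, which is what allows a fixed linear filter to be used.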

Summarizing  this  section,  one  finds  that  even  for  the  basically 
"error-free"  SAR  system,  in  which  fading  noise  is  the  dominant  degradation 
factor,  extracting  information  from  or  digitally  restoring  the  imagery  can 
be greatly aided by a priori knowledge of the fading statistics. Formation
of noise models is an obviously important step in the design of the
restoration (signal estimation) filter. Not only should the knowledge of
speckle characteristics be applied when one develops SAR interpretability
measures, but also techniques for speckle noise removal will have application
to correction of other types of SAR "errors". The formulation of
an  image  restoration  technique  must  also  enter  into  the  quantification  of 
interpretability/quality  measures  of  a  SAR  system  degraded  intentionally 
in  the  proposed  study. 

1.1.4  Interpretability/Quality  Literature  in  the  Optics  Field 

The  understanding  of  imaging  optical  and  radar  systems  is  greatly 
enhanced  by  formation  of  a  linear  systems  model  (to  a  lesser  extent 
by  a  nonlinear  mathematical  model).  Both  types  of  systems  (radar  and 
optical)  have  been  characterized  by  a  two-dimensional  point  spread 
function  (psf);  attempts  at  quantifying  system  quality  by  singling  out 
measures  of  the  psf  have  been  less  than  successful  (Brock,  1967). 

The  difficulties  in  using  impulse  response  criteria  or  modulation 
transfer  function  criteria  have  been  discussed  in  the  optical  systems 
context by Brock (1967) and Noffsinger (1970, 1971). Boiling down their
discussions to the essence, one finds that quality/interpretability
measures that do not factor in the application or the observation circumstances
fail to adequately represent the system in question. Granger
and Cupery (1972) attempt to incorporate the human factor in their
article "An Optical Merit Function (SQF), Which Correlates with Subjective
Image Judgments." A visual system model is used to develop the
SQF.  Similar  visual  system  models  are  applied  by  Stockham  (1972)  in 
an  excellent  article  and  in  Hunt  (1975).  Many  other  references,  too 
numerous  to  mention,  relevant  to  image  quality/interpretability  exist; 
a  good  bibliography  is  presented  in  Brock  (1967).  Mitchel  (1974) 
selected  several  of  these  articles  and  others  that  he  felt  were  pertinent 
to  SAR  quality. 

1.1.5 Data Base (Test Target) Design for Interpretability/Quality
Studies

Just  as  one  would  not  test  a  recipe  using  poor  quality  ingredients, 
one  must  also  carefully  choose  test  targets  for  system  analysis.  A  common 
mistake  to  be  avoided  in  SAR  image  quality  work  is  use  of  a  test  scene  which 
is  unsuitable  because  of  its  spectral  composition.  For  example,  many 
workers in the optical field have employed a tri-bar or multiple bar
target, mistakenly believing that this target models a sine wave transmittance
in the space domain and, thus, that its spectrum can be
modeled as a delta function in the spatial frequency domain (Brock, 1967).
Thus,  for  either  simple  or  complex  scenes  the  power  spectra  should  be 
known,  and  should  be  suitable  for  the  experiment  at  hand.  For  example, 
sufficient  bandwidth,  or  "whiteness"  of  the  spectrum  of  a  target  scenario 
might  be  important.  If  the  spectrum  does  not  satisfy  one's  criteria,  it 
can be augmented and the new scene defined by the inverse transform of the
augmented  spectral  version. 
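
The point about bar targets can be checked numerically by comparing the spectrum of a sine wave transmittance with that of a three-bar target of the same period. The sketch below uses a direct one-dimensional DFT and reports the share of non-DC power that falls in the single fundamental-frequency bin. The field length, bar period, bar placement, and function names are hypothetical values chosen for illustration.

```python
import cmath
import math

def dft_power(signal):
    """Power spectrum |X[k]|^2 for k = 0 .. N/2, by direct DFT summation."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal))) ** 2
        for k in range(n // 2 + 1)
    ]

def fundamental_fraction(signal, k0):
    """Share of the non-DC power that lies in the single bin k0."""
    power = dft_power(signal)
    return power[k0] / sum(power[1:])

N, PERIOD = 128, 16
k0 = N // PERIOD                      # bin of the fundamental spatial frequency

# Sine wave transmittance filling the whole field
sine = [0.5 + 0.5 * math.sin(2 * math.pi * t / PERIOD) for t in range(N)]

# Tri-bar target: three bars, each half a period wide, on a dark field
tri_bar = [0.0] * N
for bar in range(3):
    start = 40 + bar * PERIOD
    for t in range(start, start + PERIOD // 2):
        tri_bar[t] = 1.0

print("sine   :", fundamental_fraction(sine, k0))
print("tri-bar:", fundamental_fraction(tri_bar, k0))
```

For the sine wave essentially all of the non-DC power sits in one bin, while the tri-bar target spreads most of its power over other spatial frequencies, so its spectrum is far from a delta function.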

T.W.  Barnard  (1972)  discusses  another  aspect  of  target  selection 
in  his  article  "Image  Evaluation  by  Means  of  Target  Recognition."  The 
emphasis  is  not  on  spectral  content  in  this  case,  but  rather  he  is  concerned 
with  providing  targets  to  the  human  observers  for  which  they  already 
possess  mentally  stored  visual  "matched  filters."  Barnard  gives  the 
examples of the "Landolt-C," numerals, and "Spokes" targets. Similar in
nature are the vision-testing charts containing letters of our alphabet,
and those having the "E's" opening up, down, right, left. Barnard would,
for example, define as unacceptable a chart of Cyrillic alphabet characters for
English-speaking observers. This would seem to be an important consideration
for SAR system interpretability if only humans not familiar
with radar interpretation are available to rank or describe simulated
scenes.  Pratt  (1978)  also  discusses  target  scene  selection. 


2.0  EXPERIMENTAL  DESIGN 


An  experimental  design  in  general  consists  of  selecting  the  different 
conditions  under  which  observations  will  be  obtained.  Proper  selection 
of these conditions, i.e., fully exploring the likely range of operational
conditions, is essential for efficient estimation of a relationship
between the experimental variables and the response. In this case
the  experimental  conditions  refer  to  a  specific  level  of  image  "quality". 
Therefore,  to  obtain  radar  images  with  controlled  levels  of  utility  a 
radar  image  was  processed  (degraded)  using  digital  techniques.  The 
response  in  this  case  is  the  judgment  of  human  interpreters  of  the 
utility  of  the  degraded  radar  images. 

An  experiment  was  designed  to  investigate  the  relationship  between 
measured  image  quality  parameters  (IQP)  and  image  utility  for  several 
applications.  Because  of  the  large  number  of  data  points  (degraded 
images)  required  to  totally  explore  desired  relationships  it  was  decided 
to  assume  that  all  third  and  higher  order  interactions  between  the  IQP's 
could  be  deleted.  That  is,  a  second  order  model  was  assumed.  This 
assumption has been successful in the past (Soliday, 1974; Williges,
1971; Craig and Hershberger, 1977).

Techniques  outlined  in  Myers  (1971)  were  used  to  efficiently  specify 
the  number  and  level  (of  degradation)  of  the  required  images.  A  central 
composite  design  (CCD)  with  uniform  precision  was  selected.  For  example, 
with  five  image  metrics,  32  different  experimental  conditions  are  required. 
The advantage of this approach was that the minimum number of observations
was used and that each degradation was selected so that all the data
points had uniform importance. Eight extra degraded images were added
to complete the data set of 40 images.
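
As an illustration of how a five-factor central composite design can reach
32 conditions, the sketch below builds one from a half-fraction factorial
(16 points), 10 axial points, and 6 center points. The generator E = ABCD,
the axial distance alpha = 2, and the six center points are conventional
textbook choices for a uniform-precision design and are assumptions here;
the report does not list its design points.

```python
import itertools
import numpy as np

def central_composite_half_fraction(n_center=6, alpha=2.0):
    """Coded design points for 5 factors: 2^(5-1) factorial + axial + center."""
    # Full 2^4 factorial in the first four factors; the fifth factor is the
    # product of the other four (assumed generator E = ABCD).
    base = np.array(list(itertools.product([-1.0, 1.0], repeat=4)))
    fifth = base.prod(axis=1, keepdims=True)
    factorial = np.hstack([base, fifth])                      # 16 x 5

    # Two axial points per factor, at distance +/- alpha from the center.
    axial = np.vstack([sign * alpha * np.eye(5)[i]
                       for i in range(5) for sign in (-1.0, 1.0)])  # 10 x 5

    center = np.zeros((n_center, 5))                          # replicated center
    return np.vstack([factorial, axial, center])

design = central_composite_half_fraction()
print(design.shape)
```

Each row is one experimental condition in coded units; mapping the coded
levels onto actual degradation settings is a separate step not shown here.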

Optically processed synthetic aperture radar imagery of the Roanoke,
Virginia area was obtained in the form of film positive transparencies.
The  radar  system  was  the  Goodyear  XA2  which  has  a  resolution  (6  dB  width 
of  the  image  impulse  response)  of  approximately  15  ft.  The  transmitted 
carrier  frequency  was  approximately  10  GHz  and  the  system  transmitted  and 
received  horizontal  polarizations.  The  parameters  of  the  flight  which 
collected  this  imagery  include  a  20,000  foot  altitude  with  a  near  range 
distance  (ground  track  to  near  range)  of  8  miles  and  a  far  range  distance 
of  10  miles.  The  flight  date  was  June  20,  1968.  The  scale  of  the  re¬ 
ceived  imagery  was  1:100,000.  The  selected  imagery  was  digitized  using 
a  sampling  rate  of  1000  pt/in.;  this  yields  one  pixel  every  8.3  ft. 
which  is  close  to  the  required  Nyquist  spacing  of  7.5  feet.  Any  aliasing 
effects  were  thus  ignored  as  were  the  effects  of  the  sampling  aperture. 
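
The sampling figures quoted above can be checked with a few lines of
arithmetic. The sketch below assumes that the Nyquist spacing is taken as
half of the 15 ft resolution; the function name is illustrative only.

```python
def ground_sample_spacing_ft(scale_denominator, samples_per_inch):
    """Ground distance between samples, in feet, for film digitized at a given rate."""
    film_spacing_in = 1.0 / samples_per_inch            # spacing on the film
    ground_spacing_in = film_spacing_in * scale_denominator
    return ground_spacing_in / 12.0                     # inches -> feet

spacing = ground_sample_spacing_ft(100_000, 1000)       # 1:100,000 at 1000 pt/in.
nyquist = 15.0 / 2.0                                    # assumed: half the resolution
print(round(spacing, 1), nyquist)
```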

Four  subareas  of  this  imagery  were  selected  for  analysis  because  of  the 
variety  of  targets  contained  therein.  These  scenes  are  shown  in  Figures  1-4; 
Figures  1  and  2  are  aerial  photographs  and  Figures  3  and  4  are  the  original 
radar  images.  The  40  test  images  were  generated  from  these  images.  That  is, 
a  complete  data  set  consisted  of  10  degradations  of  scene  A,  8  degradations 
of  scene  B,  10  degradations  of  scene  C,  and  12  degradations  of  scene  D  as 
prescribed  by  the  CCD. 

Each of eight trained Army photo-interpreters (PI) was given a complete
set of 40 degraded images and a brief introduction to the experiment
explaining its goals and the interpreter's role. Also included were
instructions, questions, and interpretation guidelines. An answer sheet
was provided for each image. The answer sheet allowed each PI to rank
each image according to his ability to identify, classify, and detect
specific terrain features. For analysis the PIs' responses were averaged
to form four basic response categories: (1) linear features, (2) natural
area-extensive features, (3) complex area features, and (4) individual
man-made targets. Further, each PI was asked to rank order the degraded
images of each scene from best to worst, using both vehicle movement and
activity level as criteria, thereby providing a fifth response category.
However, in all cases the ranking was identical for both applications.
In addition, auxiliary data concerning the target area (e.g., maps,
aerial photographs and large area coverage SAR imagery) were also
provided. (See Appendix A for an interpreter package.)

In contrast to most image quality experiments, both absolute and
relative (rankings) responses for the degraded images were obtained.
This permitted the investigation of the correlation between the responses
to various questions.


3.0  DEGRADATION  PROCEDURES 

The purpose of this section is to explain the processing steps
that were performed on the digitized radar imagery to produce the
desired set of experimental conditions. That is, the digitized radar
images were processed to exhibit controlled levels of image quality.
Thus for one digitized radar image several degraded images were generated
and then evaluated by human interpreters. In this section the
five processing algorithms which were used to degrade the radar images
are presented: spatial frequency filtering, geometric distortion, noise
addition, quantizing, and spatial domain averaging.

A.  Spatial  Frequency  Filtering 

In  the  first  processing  step  the  digitized  radar  images  were 
ideal  low  pass  filtered  in  the  spatial  frequency  domain.  The  purpose 
of  this  step  was  to  limit  the  frequency  content  of  the  observed 
images. 
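
A minimal sketch of ideal low pass filtering in the spatial frequency
domain follows. The circular passband and the cutoff expressed in cycles
per pixel are assumptions; the report does not give the passband shape or
the cutoff values used.

```python
import numpy as np

def ideal_low_pass(image, cutoff):
    """Zero every spatial frequency whose radius exceeds `cutoff` (cycles/pixel)."""
    rows, cols = image.shape
    fy = np.fft.fftfreq(rows)[:, None]        # vertical frequencies
    fx = np.fft.fftfreq(cols)[None, :]        # horizontal frequencies
    mask = np.sqrt(fx**2 + fy**2) <= cutoff   # 1 inside the passband, 0 outside
    spectrum = np.fft.fft2(image)
    return np.real(np.fft.ifft2(spectrum * mask))

# A checkerboard holds only the highest spatial frequency, so it is removed.
i, j = np.indices((8, 8))
checker = (-1.0) ** (i + j)
filtered = ideal_low_pass(checker, cutoff=0.25)
```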

B.  Geometrical  Distortion 

Most image quality experiments do not consider geometric fidelity.
A circularly symmetric geometric distortion was applied to the radar
images. The distortion is defined by a simple sinusoidal compression,
i.e.,

$$\Delta D = \sin\theta \, \Delta I \tag{3.1}$$

where:

$\Delta I$ = a change in distance in the ideal (undistorted) image
$\Delta D$ = a change in distance in the distorted image,

$$\tan\theta = \tan\theta_1 + \frac{x}{w} \, \frac{\cos\theta_1 - \cos\theta_2}{\cos\theta_1 \cos\theta_2} \tag{3.2}$$

The two angles $\theta_1$ and $\theta_2$ are constants and vary the degree
of geometric distortion. The distance to the center of the image from the
edge is w and the distance from the center of the image to the center of
$\Delta I$ is x. This type of distortion is similar to both pin-cushion
distortion and, in radar, near range compression. By controlling the
geometric fidelity of the degraded images it was hoped that a relationship
between this image characteristic and image utility could be defined.


C.  Additive  Noise 

Most  image  quality  studies  include  the  effect  of  additive  white 
Gaussian  noise.  The  third  degradation  applied  to  the  original  scenes 
was  the  addition  of  such  noise.  It  is  well  known  that  for  radar, 
fading  noise  is  more  significant  than  receiver  noise.  Receiver  noise 
is  usually  modeled  as  being  white  additive  while  fading  noise  is 
neither  white  nor  additive. 
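
The additive white Gaussian noise degradation can be sketched as below.
The function name, the seed argument, and the noise level in the usage
line are illustrative assumptions; the report does not list the noise
variances it applied.

```python
import numpy as np

def add_white_gaussian_noise(image, sigma, seed=0):
    """Add independent zero-mean Gaussian noise of standard deviation sigma."""
    rng = np.random.default_rng(seed)          # seeded for repeatability
    return image + rng.normal(0.0, sigma, size=image.shape)

noisy = add_white_gaussian_noise(np.zeros((256, 256)), sigma=5.0)
```

Because the noise samples are independent from pixel to pixel, the noise
is white; fading noise, as the text points out, would need a different
(non-additive) model.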


D.  Number of Quantized Levels

An important parameter in system design, especially when digital
processing is used or when the image data are transmitted over a
communication channel (as was the case with satellite data), is the
number of quantized levels. This number should be minimized while still
maintaining a specified level of system performance. Therefore the number
of quantized levels was varied to establish a relationship between radar
image utility and the number of signal levels.
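
One way to vary the number of quantized levels is uniform requantization,
sketched below. The uniform step, the mid-step reconstruction values, and
the 0 to 255 grey range are assumptions; the report does not describe its
quantizer.

```python
import numpy as np

def quantize(image, n_levels, g_min=0.0, g_max=255.0):
    """Uniformly requantize grey values in [g_min, g_max] to n_levels levels."""
    step = (g_max - g_min) / n_levels
    # Index of the quantization bin for each pixel, clipped to the valid range.
    index = np.clip(np.floor((image - g_min) / step), 0, n_levels - 1)
    return g_min + (index + 0.5) * step        # reconstruct at the bin center

ramp = np.arange(256.0)
coarse = quantize(ramp, n_levels=4)
```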


E.  Spatial Domain Filtering


The  final  algorithm  applied  to  the  radar  images  was  a  simple 
uniform  weighted  square  spatial  domain  filter.  The  purpose  of  this 
algorithm  was  to  simulate  various  levels  of  system  resolution. 
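
A uniform weighted square spatial filter amounts to a moving average. The
sketch below assumes an odd window size and edge replication at the image
border; neither detail is specified in the report.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_filter(image, size):
    """Replace each pixel by the unweighted mean of a size x size window (size odd)."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")           # replicate border pixels
    windows = sliding_window_view(padded, (size, size))
    return windows.mean(axis=(-2, -1))

impulse = np.zeros((5, 5))
impulse[2, 2] = 9.0
smoothed = box_filter(impulse, size=3)
```

Larger windows spread a point target over more pixels, which is how the
filter stands in for a coarser system resolution.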


4.0  RADAR IMAGE QUALITY METRICS

A  set  of  five  metrics  was  selected.  Multiple  metrics  were  used 
because  it  was  hypothesized  that  no  one  metric  could  adequately  characterize 
the  utility  of  radar  images.  These  metrics  were  intended  to  be  independent 
measures of basic properties of radar imagery. There was one exception
to this guideline: a root mean square error criterion was applied only
because of its extensive use in other image analysis research.

Most  of  the  quality  metrics  proposed  in  previous  work  can  be  re¬ 
lated  to  one  of  these  four  metrics.  For  example,  many  researchers  have 
proposed  some  measure  of  sharpness,  resolution,  edge  quality,  busyness, 
etc.  All  of  these  are  directly  related  to  the  bandwidth  of  the  image. 
Therefore,  the  first  of  our  metrics  was  a  root  mean  square  bandwidth. 

Image quality has also been related to the dynamic range and
signal-to-noise ratios; these image characteristics were also measured in
this experiment. The last metric was specifically designed to estimate
geometric fidelity. As mentioned previously, geometric fidelity is an
uncommon parameter to be treated in image quality studies, and as such a
new metric had to be developed; this will be presented later. Each of
the metrics was adapted for use on a digital computer.

In the following paragraphs the implementation of each of the metrics
will be discussed.

A.  Mean  Square  Bandwidth  (MSB) 

The definition of the MSB for a continuous one-dimensional signal,
f(t), is

$$\langle \omega^2 \rangle = \frac{\int \omega^2 \, |F(\omega)|^2 \, d\omega}{\int |F(\omega)|^2 \, d\omega} \tag{4.1}$$

where:

$F(\omega) \Leftrightarrow f(t)$, i.e., $F(\omega)$ is the Fourier transform of f(t).

The fast Fourier transform (FFT) is required to obtain $F(\omega)$ for
discrete signals. Defining $F(i\Delta\omega)$ as the FFT of f(t), the MSB
can be measured by evaluating

$$\langle \omega^2 \rangle = (\Delta\omega)^2 \, \frac{\sum_{i=1}^{N} (i-1)^2 \, |F(i\Delta\omega)|^2}{\sum_{i=1}^{N} |F(i\Delta\omega)|^2} \tag{4.2}$$

where:

$\Delta t$ = sample spacing
N = number of sample points

The actual metric used was $\sqrt{\langle \omega^2 \rangle}$, the root mean
square bandwidth (RMSB). To use the RMSB as a valid indication of changes
in image bandwidth for SAR imagery it must be assumed that the original
imagery was fully focused. The RMSB is either directly or indirectly
relatable to many previously proposed image quality metrics, e.g.,
sharpness. The above definition is only valid for one-dimensional signals.
To extend its use to two dimensions a further assumption was made, i.e.,
the MSB of a two-dimensional image can be approximated by the product of
the RMSB in orthogonal directions, e.g., along columns and rows of a
digital image. This is a common assumption which is particularly valid
for imaging radars (Harger, 1970).
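
The discrete RMSB of Equation (4.2) and its two-dimensional extension can
be sketched as follows. Three details are assumptions not stated in the
report: the frequency step is taken as 2*pi/(N*dt), only the non-negative
frequency half of the FFT is summed, and the line spectra are averaged
over all rows (or columns) before the RMSB is formed.

```python
import numpy as np

def rmsb_from_power(power, n_samples, dt=1.0):
    """Root mean square bandwidth from a one-sided power spectrum."""
    dw = 2.0 * np.pi / (n_samples * dt)       # assumed frequency step
    index = np.arange(power.size)             # plays the role of (i - 1)
    return np.sqrt(dw**2 * np.sum(index**2 * power) / np.sum(power))

def rmsb_1d(signal, dt=1.0):
    power = np.abs(np.fft.rfft(signal))**2
    return rmsb_from_power(power, len(signal), dt)

def rmsb_2d(image, dt=1.0):
    """Product of the RMSB along rows and the RMSB along columns."""
    row_power = (np.abs(np.fft.rfft(image, axis=1))**2).mean(axis=0)
    col_power = (np.abs(np.fft.rfft(image, axis=0))**2).mean(axis=1)
    return (rmsb_from_power(row_power, image.shape[1], dt)
            * rmsb_from_power(col_power, image.shape[0], dt))

# A pure tone with 4 cycles in 64 samples has all its power in one bin.
t = np.arange(64)
tone = np.cos(2.0 * np.pi * 4 * t / 64)
tone_rmsb = rmsb_1d(tone)
```

A signal with zero total power would cause a division by zero; a
production version would need to guard against that case.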

B.  Root  Mean  Square  Error  (RMSE) 

The  RMSE  is  commonly  used  as  a  quality  criterion  in  image  analysis 
research,  especially  in  the  bandwidth  compression  area.  To  allow  for  the 
comparison  of  the  work  with  previous  research  the  MSE  as  defined  below 
was  measured  on  each  degraded  image. 

$$\epsilon^2 = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( I(i,j) - D(i,j) \right)^2 \tag{4.3}$$

where:

$\epsilon^2$ = Mean Square Error
N x N = dimension of the image
I(i,j) = pixel value at location i,j in the original SAR image
D(i,j) = pixel value at location i,j in the degraded SAR image.
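
A direct transcription of the mean square error into code is short. The
function names are illustrative; the square root is taken separately so
that either the MSE or the RMSE can be reported.

```python
import numpy as np

def mean_square_error(original, degraded):
    """Average squared pixel difference between two images of equal size."""
    diff = np.asarray(original, dtype=float) - np.asarray(degraded, dtype=float)
    return np.mean(diff**2)

def rmse(original, degraded):
    return np.sqrt(mean_square_error(original, degraded))

reference = np.zeros((4, 4))
shifted = np.full((4, 4), 3.0)
error = rmse(reference, shifted)
```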

C.  Image  Dynamic  Range 

Dynamic range is a measure of the relationship between the "minimum"
and "maximum" shades of grey (grey levels) in an image. There are
different ways of defining both dynamic range and the minimum and maximum
grey levels. Since we are working with digital imagery it is almost a
certainty that at least one picture element (i.e., an outlying data point)
will have the absolute minimum and another will have the absolute maximum
of the allowable grey levels. A relative frequency approach is more
suitable for our problem. That is, the minimum and maximum values would be
determined a priori by establishing a fixed probability, for example, 5%,
and measuring the highest and lowest grey levels which enclose 90% of the
grey values. These grey levels would then become effective minimum and
maximum values for the image.

Next we need to define how the effective values described above can
be used to measure dynamic range or the contrast of the image. Five
possible contrast measures are (Pratt, 1978)

$$\text{Contrast Ratio} = \frac{g_{max}}{g_{min}} \tag{4.4}$$

$$\text{Contrast Modulation} = \frac{g_{max} - g_{min}}{g_{max} + g_{min}} \tag{4.5}$$

$$\text{Differential Contrast} = \frac{g_{max} - g_{min}}{g_{min}} \tag{4.6}$$

$$\text{Root Contrast Modulation} = \frac{\sqrt{g_{max}} - \sqrt{g_{min}}}{\sqrt{g_{max}} + \sqrt{g_{min}}} \tag{4.7}$$

and

$$\text{Relative Contrast} = \frac{g_{max} - g_{min}}{g_T} \tag{4.8}$$

where:

$g_{max}$ = effective maximum grey shade
$g_{min}$ = effective minimum grey shade
$g_T$ = total possible range

Note that each of these contrast measures depends upon the same image
characteristics, i.e., $g_{min}$ and $g_{max}$; thus, these are the two
measurements which are required. In our study we attempted to correlate
relative contrast with image utility.
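
The relative frequency approach and the relative contrast measure can be
sketched with percentiles. The use of `np.percentile` with its default
linear interpolation and the 5% tail on each side are assumptions made for
illustration; the report does not say how the 90% interval was located.

```python
import numpy as np

def effective_grey_range(image, tail_percent=5.0):
    """Grey levels that enclose the central (100 - 2*tail_percent)% of pixels."""
    g_min, g_max = np.percentile(image, [tail_percent, 100.0 - tail_percent])
    return g_min, g_max

def relative_contrast(image, total_range=255.0, tail_percent=5.0):
    """Effective grey level span divided by the total possible range."""
    g_min, g_max = effective_grey_range(image, tail_percent)
    return (g_max - g_min) / total_range

ramp = np.arange(101.0)                       # grey values 0, 1, ..., 100
contrast = relative_contrast(ramp, total_range=100.0)
```

Isolated outliers at the extreme grey levels leave the result almost
unchanged, which is the motivation given in the text for this approach.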


D.  Signal-to-Noise  Ratio 

Signal-to-noise ratio is an extremely important image characteristic,
though its definition is entirely subject to the stochastic
nature of the particular image formation process. We have already
briefly discussed some aspects of photographic and radar image noise.
Biberman (1974) analyzed the signal-to-noise ratio for electro-optical
imaging systems including the display. Any solely deterministic
image quality analysis (i.e., not including random phenomena) cannot
provide an adequate assessment of the image because the probability of
correctly identifying a target is strongly dependent upon the
signal-to-noise ratio. Signal-to-noise ratios can be determined either by
a rigorous theoretical analysis or by measurement; depending upon the
intended application of the imagery, either technique could suffice.

For this study the signal-to-noise ratio is defined as

$$S/N = \bar{x}^2 / S^2 \tag{4.9}$$

where S/N is the signal-to-noise ratio and $\bar{x}^2 / S^2$ is the ratio
of the square of the mean to the variance in "homogeneous" areas. A
homogeneous area is a region in an image with stationary statistics, e.g.,
a wheat field.
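
Equation (4.9) can be evaluated on any block of pixels judged to be
homogeneous. The sketch below assumes the unbiased sample variance
(divisor n - 1); the report does not state which variance estimator was
used, and selecting the homogeneous region is left to the analyst.

```python
import numpy as np

def homogeneous_snr(region):
    """Squared mean divided by sample variance for a homogeneous image region."""
    region = np.asarray(region, dtype=float)
    return region.mean()**2 / region.var(ddof=1)

patch = np.array([[9.0, 11.0], [9.0, 11.0]])   # made-up grey values
snr = homogeneous_snr(patch)
```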

E.  A  Geometric  Fidelity  Measure 

The geometric fidelity of the degraded radar images was disturbed
in this experiment as described previously. It was therefore necessary
to develop an algorithm which would measure the geometric distortion
introduced in these images. A heuristic approach was followed. This
approach was based on the realization that a geometric distortion is most
visible on field boundaries and edges. Therefore, the RMS error between
an edge image generated from the original data and an edge image derived
from the degraded scene was used as a measure of geometric fidelity. The
edge images were generated in both cases by a Roberts gradient. The
Roberts gradient is defined as

$$G_R(j,k) = [F(j,k) - F(j+1,k+1)]^2 + [F(j,k+1) - F(j+1,k)]^2 \tag{4.10}$$

The edge images generated by a Roberts gradient are multi-level images
and approximate a two-dimensional differentiation. If an edge in the
degraded image is offset (distorted) with respect to the original scene,
the RMS error will increase as the offset increases. The magnitude
of the increase is dependent upon the edge contrast. This is the
principal property any geometric fidelity measure must exhibit.
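
The geometric fidelity measure can be sketched directly from Equation
(4.10). The function names are illustrative, and the step-edge images at
the end are made-up test data, not imagery from the experiment.

```python
import numpy as np

def roberts_gradient(image):
    """Edge image from the squared diagonal differences of Equation (4.10)."""
    f = np.asarray(image, dtype=float)
    d1 = f[:-1, :-1] - f[1:, 1:]      # F(j,k)   - F(j+1,k+1)
    d2 = f[:-1, 1:] - f[1:, :-1]      # F(j,k+1) - F(j+1,k)
    return d1**2 + d2**2

def geometric_fidelity(original, degraded):
    """RMS error between the edge images of the original and degraded scenes."""
    diff = roberts_gradient(original) - roberts_gradient(degraded)
    return np.sqrt(np.mean(diff**2))

# A vertical step edge and the same edge displaced by one column.
edge = np.zeros((8, 8))
edge[:, 4:] = 1.0
moved = np.zeros((8, 8))
moved[:, 5:] = 1.0
fidelity = geometric_fidelity(edge, moved)
```

For an ideal one-pixel-wide step the error stops growing once the two
edges no longer overlap, so the increase with offset described in the text
is most apparent for the broader edges of real imagery.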


5.0  CORRELATION ANALYSIS


An important initial question that should be addressed before any
models are estimated to predict image quality from interpreter responses
is how consistent the interpreters were in judging the degraded images.
As a corollary to this initial question we are also interested in
identifying individual interpreters whose responses vary significantly
with respect to the other interpreters. Also, in preparation for
estimating the regression models, an analysis is necessary to assess how
strongly the interpreters' responses are related to the image quality
metrics. This information was obtained through a correlation analysis
of the responses and measured data.

Define a matrix with 40 rows, one row corresponding to each degraded
image, and M columns as y. A column in y will represent the interpreter
responses, average interpreter response, degradation parameters, or
quality metrics for each of the 40 degraded images. For example, if
column i represents the fifth interpreter's responses and column j
contains the average interpreter's responses, then the correlation
between the average interpreter and interpreter five is defined as

$$r_{ij} = \frac{\sum_{k=1}^{N} \eta_{ki} \, \eta_{kj}}{(N-1) \, s_i \, s_j} \tag{5.1}$$

where:

$$\eta_{ki} = y_{ki} - \bar{y}_i , \qquad s_i^2 = \frac{1}{N-1} \sum_{k=1}^{N} \eta_{ki}^2 \tag{5.2}$$

where:

$$\bar{y}_i = \frac{1}{N} \sum_{k=1}^{N} y_{ki}$$

and N is the number of rows of y.

The  above  simple  correlation  was  used  to  analyze  the  relationships  between 
the  important  elements  of  this  study. 
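
The simple correlation between two columns of the response matrix can be
computed as below. The sketch uses the usual sample correlation with an
N - 1 divisor and checks it against `np.corrcoef`; the random matrix is
made-up data standing in for the 40 x M matrix y.

```python
import numpy as np

def column_correlation(y, i, j):
    """Sample correlation between columns i and j of the response matrix y."""
    eta_i = y[:, i] - y[:, i].mean()           # deviations from the column mean
    eta_j = y[:, j] - y[:, j].mean()
    n = y.shape[0]
    s_i = np.sqrt(np.sum(eta_i**2) / (n - 1))  # sample standard deviations
    s_j = np.sqrt(np.sum(eta_j**2) / (n - 1))
    return np.sum(eta_i * eta_j) / ((n - 1) * s_i * s_j)

rng = np.random.default_rng(1)
y = rng.normal(size=(40, 3))                   # made-up responses, 40 images
r_01 = column_correlation(y, 0, 1)
all_pairs = np.corrcoef(y, rowvar=False)       # full correlation matrix
```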

A.  Analysis of Group and Individual Interpreters' Responses

An  analysis  was  undertaken  to  determine  the  variation  among  the 
interpreter  responses  individually  and  in  pairs.  This  was  possible  only 
because  each  interpreter  provided  five  responses  for  each  degraded  image. 
Four  of  these  responses  related  to  the  interpreter's  judgment  of  his 
ability  to  identify,  classify  or  detect  targets  in  four  target  categories, 
(1)  linear,  (2)  natural  area  features,  (3)  complex  area  features,  and 
(4)  individual  man-made  targets.  These  judgments  are  referred  to  as 
absolute  responses.  A  fifth  response  category  was  obtained  by  having 
each  interpreter  rank  the  images  from  best  to  worst;  these  form  a  set 
of  relative  responses. 

The variation among the interpreters' responses could thus be obtained
by calculating $r_{ij}$ between the responses of one interpreter for one
category and another interpreter for the same category. Or the internal
consistency of each interpreter could be evaluated by examining the
$r_{ij}$ for an individual interpreter across two categories. In this case
if $r_{ij}$ is near unity (e.g., greater than .8) then the interpreter's
responses for the ith and jth categories were highly correlated and thus
the interpreter was consistent in evaluating these categories. Note that
low internal consistency indicates either that the interpreter did
not judge the degraded images uniformly or that the criteria (i.e.,
the image information used) for judging one category are unrelated to
the criteria used for judging the other category.

Table  1  contains  the  results  of  the  internal  consistency  analysis. 
Obviously  from  this  data  the  internal  consistency  of  all  eight  inter¬ 
preters  and  the  average  interpreter  was  very  low.  Thus  either  the 
interpreter  did  not  judge  the  degraded  images  uniformly  or  the  image 
information  required  for  each  category  is  different. 

In Table 2 the correlation between each interpreter and the "average
interpreter" is presented. If $y_{ij}$ is the response of the jth
interpreter to the ith image then the average interpreter response is
defined as

$$\bar{y}_i = \frac{1}{M} \sum_{j=1}^{M} y_{ij}$$

where:

M = number of interpreters.

Consider the correlations for categories 1 and 5. The interpreters as
a group were fairly consistent in responding to these categories, as is
evident from the magnitude of the correlation coefficients. Also, these
correlation coefficients indicate that for category 1 interpreters 1,
4 and 6 were the most consistent, i.e., they were highly correlated with
the average. The same interpreters are also strongly interrelated while
the other five interpreters' responses were relatively uncorrelated.
This is found by examining the correlation coefficient between individual
interpreters. Similarly, interpreters 1, 2, 3, 7 and 8 are highly
correlated for category 5. Table 2 also indicates that the interpreters
as a group were very inconsistent in responding to categories 2, 3, and
4. This conclusion is also reached by examining the correlation between

TABLE 1

Correlation Coefficient for Each Interpreter

Category                      Interpreter #
 pair      1     2     3     4     5     6     7     8    Ave.
 1-2      --    --    --    --    --    --    --    --    --
 1-3      --    --    --    --    --    --    --    --    --
 1-4     .75   .42   .45   .38   .11   .44   .32   .39    --
 1-R     .18   .25   .19   .14   .03   .14   .47   .29   .26
 2-3     .61   .09   .56   .03   .70   .31   .29   .52   .65
 2-4     .40   .31   .11   .03   .54   .33   .06   .19   .30
 2-R     .09   .45   .02   .00   .15   .22   .18   .22   .25
 3-4     .38   .45   .05   .41   .61   .07   .51   .39   .49
 3-R     .01   .07   .08   .40   .01   .45   .59   .39   .53
 4-R     .01   .10   .24   .32   .04   .55   .68   .56   .61
 Ave.    .30   .28   .26   .20   .26   .25   .37   .33   .36
 Sd      .25   .16   .20   .17   .26   .16   .20   .14   .21

-- : entry not legible in the source. Rows 1-2 and 1-3 are only partly
legible; the surviving entries, in scan order, are .44 .40 .23 .22 .16 .01
.25 .50 .06 .18 .23 .11, and their column positions cannot be recovered.
Only eight entries survive for row 1-4; their placement under
interpreters 1 through 8 is uncertain.

TABLE 2

Correlation of Each Interpreter With the
"Average Interpreter"

                    PI
Category     1     2     3
   1        .95   .82   .83
   2        .67   .69   .50
   3        .29   .50   .18
   4        .50   .64   .51
   5        .94   .94   .82

The columns for PIs 4 through 8 are only partly legible in the source;
the surviving entries, in scan order, are .76 .95 .47 .52 .52 .82 .61 .72
.51 .02 .69 .70 .71 .73 .81 .79 .46 .44, and their row and column
positions cannot be recovered.

interpreters, i.e., for these categories the correlation coefficient
between all pairs of interpreters was low.

From  this  correlation  analysis  we  expect  the  regressions  for  cate¬ 
gories  1  (linear  features)  and  5  (the  relative  ranking)  to  be  superior  to 
the  estimated  models  for  the  other  categories.  Further,  this  analysis 
indicates  that  either  the  interpreters  did  not  judge  the  degraded  images 
uniformly  or  that  the  criteria  for  judging  each  category  were  unrelated. 


B.  Analysis  of  Interpreter  Responses  and  Degradation  Parameters  and 
Image  Quality  Metrics 

To  support  the  regression  analysis  the  correlation  between  each  of 
the  degradation  parameters  and  image  quality  metrics  and  the  interpreter 
responses  was  examined.  This  analysis  provided  an  initial  indication  of 
the  dependence  of  the  image  utility  as  a  function  of  both  the  degradation 
parameters  and  quality  metrics. 

Table  3  contains  the  correlation  coefficient  between  the  average 
interpreter's  responses  and  the  parameters  and  metrics.  Overall  this 
correlation  was  low  which  indicates  that  the  metrics  and  parameters  alone 
do  not  provide  in  a  linear  sense  a  good  indication  of  how  the  inter¬ 
preters  judged  the  degraded  images.  This  reinforces  our  belief  that  a 
combination  of  image  characteristics  is  required  to  predict  the  utility 
of  radar  images. 

It  is  evident  from  the  data  that  the  parameters  and  metrics  assume 
different  levels  of  importance  as  the  response  category  was  changed.  Con¬ 
sider  the  degradation  parameters.  All  the  parameters  assumed  approxi¬ 
mately  equal  importance  (as  measured  by  the  magnitude  of  the  correlation 
coefficient)  for  categories  5  and  4,  while  only  quantization  showed  any 
dependence  for  category  3.  Similarly  for  the  quality  metrics,  the  S/N 

TABLE 3

Correlation Between the Average Interpreter and
the Degradation Parameters and Quality Metrics

                                     Category
                            1      2      3      4      5
Degradation Parameters:
  Quantization             .01    .07    .40    .38    .63
  Warp                     .08    .13    .07    .41    .56
  Bandwidth                .11    .34    .08    .46    .65
  Spatial Filtering        .06    .27    .00    .42    .65
  Noise                    .15    .17    .16    .38    .57

Image Quality Metrics:
  RMSE*                    .43    .17    .26    .46
  RMSB*                    .45    .17    .16    .27
  Geometric Fidelity       .31    .40    .12    .24    .39
  S/N                      .74    .36    .23    .25    .09
  Dynamic Range            .21    .22    .37    .40    .52

* Only four of the five entries for this row are legible in the source;
their column positions are uncertain.


appears to be dominant for category 1 while it is completely uncorrelated
for  category  5.  This  is  a  significant  observation  in  that  it  shows  that 
image  characteristics  do  not  relate  to  image  utility  independent  of  the 


6.0  REGRESSION  ANALYSIS 


The  culmination  of  our  research  is  the  prediction  of  radar  image 
usefulness  from  measured  image  characteristics.  In  previous  sections  we 
have  discussed  the  design  of  the  experiment,  the  measured  image  character¬ 
istics  and  the  correlation  between  these  metrics  and  interpreter  responses. 
The  purpose  of  this  section  is  to  present  a  regression  analysis  which  was 
performed  to  estimate  the  desired  prediction  equations.  Several  different 
approaches  were  pursued  to  establish  adequate  prediction  equations.  Ob¬ 
viously  the  first  approach  was  to  estimate  a  second  order  linear  model 
from  the  observed  data.  A  mixed  model  was  used,  i.e.,  a  linear  combination 
of  image  metrics,  cross  product  terms  (continuous  variables)  and  blocking 
terms  (discrete  variables)  were  combined.  The  inclusion  of  blocking  vari¬ 
ables  attempts  to  control  undesirable  fluctuations  in  responses  occurring 
from  different  interpreters  and  scenes. 

Another  approach  to  finding  an  adequate  prediction  equation  was  to 
relate  the  interpreter  response  to  the  spatial-grey  level  volume.  This 
metric  has  been  proposed  and  applied  to  radar  imagery  in  the  past 
(Moore,  1979).  A  new  measure  was  developed  (based  on  the  SGL  volume) 
to  incorporate  the  variation  in  dynamic  range  into  the  model.  We  call 
this  metric  the  modified  SGL  volume.  The  models  based  on  both  the  SGL 
and  modified  SGL  volume  are  non-linear  as  will  be  explained  later. 

It is important to identify the criteria which will be applied to
judge the usefulness of the estimated models. For linear models there
are several well known criteria. For example, an F-statistic to test the
significance of the model, the sum of squares for error (SSE), and the
percentage of the total response variation explained by the model ($R^2$,
the coefficient of multiple determination) are all commonly used for
linear models; all of these criteria will be reported here for each
estimated model; however, $R^2$ will be used primarily to evaluate the
utility of the prediction equations. Unfortunately, it is difficult
to judge the quality of estimated nonlinear models as will be required
in evaluating the SGL based models. Here SSE (and estimated variance)
will serve as the judging criteria for the nonlinear models.
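
The three criteria named above can be computed from the observed and
predicted responses. The sketch uses the standard textbook definitions
for a least squares fit that includes an intercept; the argument
`n_terms` (the number of regressors, intercept excluded) and the numbers
in the usage lines are illustrative assumptions.

```python
import numpy as np

def regression_criteria(y, y_hat, n_terms):
    """Return SSE, R^2 and the overall F statistic for a fitted linear model."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    n = y.size
    sse = np.sum((y - y_hat)**2)               # sum of squares for error
    sst = np.sum((y - y.mean())**2)            # total sum of squares
    r2 = 1.0 - sse / sst                       # fraction of variation explained
    # Regression mean square over error mean square.
    f_stat = ((sst - sse) / n_terms) / (sse / (n - n_terms - 1))
    return sse, r2, f_stat

observed = [1.0, 2.0, 3.0, 4.0, 5.0]           # made-up responses
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]          # made-up model output
sse, r2, f_stat = regression_criteria(observed, predicted, n_terms=1)
```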

The  analysis  of  the  linear  model  will  be  presented  next;  however, 
before  the  results  are  shown  a  short  review  will  be  included  primarily 
to  define  terms.  The  SGL  volume  criteria  will  then  be  presented,  fol¬ 
lowed  by  the  estimated  models. 

6.1  The Linear Model and Results
6.1.1  Mathematical  Basis 

Because  there  are  no  specific  physical  laws  governing  the  relationship 
between  the  utility  of  a  microwave  image  and  the  image  characteristics,  a 
probabilistic  model  will  be  used  to  describe  this  relationship.  Specif¬ 
ically,  the  general  linear  statistical  model  will  be  used.  This  model  is 
written  as 

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i z_i + \epsilon \tag{6.1}$$

where the $z_i$'s represent in this case known image parameters (or
functions of parameters) and the $\beta_i$'s are unknown model parameters
which define the desired relationship. This model is called "linear"
because it is linear in the unknown model parameters. Both the
$\beta_i$'s and $z_i$'s are deterministic values; the random component,
$\epsilon$, characterizes the stochastic nature of the observation y. The
usual assumption for this model is that $E[\epsilon] = 0$ and
$Var[\epsilon] = \sigma^2$ and that $\epsilon$ is normally distributed.

To reduce the number of observations required to estimate the
$\beta_i$'s we will use the following specific form of the general linear
model:

$$y = \gamma_0 + \sum_{i=1}^{k} \gamma_i x_i + \sum_{i=1}^{k} \sum_{\substack{j=1 \\ i \neq j}}^{k} \gamma_{ij} \, x_i x_j + \sum_{i=1}^{k} \gamma_{ii} \, x_i^2 + \epsilon \tag{6.2}$$

where:

$x_i$ = ith image characteristic.

This model assumes that all three way interactions (e.g.,
$x_i \cdot x_j \cdot x_l$ terms) do not significantly contribute to the
response y and thus are neglected. This model can also be viewed as a
quadratic fit to the true higher order surface which defines the
relationships between the image parameters and the response y.
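
To make the second order model concrete, the sketch below expands a table
of image characteristics into the regressors of the model: a constant, the
linear terms, the squared terms, and the cross products. As an assumption
made here for numerical reasons, each cross product x_i*x_j is included
once (i < j), since a second copy with the indices swapped would duplicate
a column; for k = 5 this gives 21 columns.

```python
import itertools
import numpy as np

def second_order_design_matrix(x):
    """Columns: 1, x_i, x_i^2, and x_i*x_j for each pair i < j."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    columns = [np.ones(n)]                                   # constant term
    columns += [x[:, i] for i in range(k)]                   # linear terms
    columns += [x[:, i]**2 for i in range(k)]                # squared terms
    columns += [x[:, i] * x[:, j]                            # cross products
                for i, j in itertools.combinations(range(k), 2)]
    return np.column_stack(columns)

rng = np.random.default_rng(0)
metrics = rng.normal(size=(40, 5))      # made-up values: 40 images, 5 metrics
Z = second_order_design_matrix(metrics)
```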

6.1.2  Model  Parameter  Estimation 

The  model  parameters  are  found  by  collecting  observations  and  by 
performing  a  minimum  mean  square  estimation.  This  procedure  is  well 
known  but  will  be  reviewed  here. 

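Consider the following experiment. N observations of the utility
of image data are obtained from M different radar images each processed
to exhibit specific characteristics. Using the model presented in
Section 6.1.1 the experiment can be mathematically written as

$$Y = Z\beta + \epsilon \tag{6.3}$$

where Y is the vector of the N observations, $\beta$ is the vector of the
$\ell$ model parameters, $\epsilon$ is the vector of the N random errors,

$$Z = \begin{bmatrix} z_{11} & \cdots & z_{\ell 1} \\ \vdots & & \vdots \\ z_{1N} & \cdots & z_{\ell N} \end{bmatrix}$$

and

$$\ell = k^2 + k + 1$$

for the second order model described by Equation (6.2). Each
$\epsilon_i$ is assumed to be zero mean Gaussian with identical variance
$\sigma^2$; further, $\epsilon_i$ is independent of $\epsilon_j$ for all
i, j, $i \neq j$. The matrix Z defines the experimental conditions under
which the observations were made, and note that

$$\beta_i = \begin{cases}
\gamma_m & \text{for } i = 0 \ldots k, \; m = 0 \ldots k \\
\gamma_{m,m} & \text{for } i = k+1 \ldots 2k+1 \\
\gamma_{m,n} & \text{for } i = 2k+2 \ldots k^2+k+1, \; n = 1 \ldots k, \; m \neq n
\end{cases} \tag{6.4}$$

and

$$z_i = \begin{cases}
x_m & \text{for } i = 1 \ldots k, \; m = 0 \ldots k \\
x_m^2 & \text{for } i = k+1 \ldots 2k+1, \; m = 0 \ldots k \\
x_m x_n & \text{for } i = 2k+1 \ldots k^2+k+1, \; m = 0 \ldots k, \; n = 0 \ldots k, \; m \neq n
\end{cases} \tag{6.5}$$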
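The minimum mean square estimate for the model parameters, $\hat{\beta}$, is defined by the vector which minimizes L, defined as

\[ L = (Y - Z\beta)^T (Y - Z\beta) \tag{6.6} \]

Following (Myers, 1977)

\[
\begin{aligned}
L &= (Y - Z\beta)^T Y - (Y - Z\beta)^T Z\beta \\
  &= Y^T Y - (Z\beta)^T Y - Y^T Z\beta + (Z\beta)^T Z\beta \\
  &= Y^T Y - \beta^T Z^T Y - Y^T Z\beta + \beta^T Z^T Z\beta \\
  &= Y^T Y - 2\beta^T Z^T Y + \beta^T Z^T Z\beta
\end{aligned}
\tag{6.7}
\]

Setting $\partial L / \partial \beta = 0$ the best fit is found as

\[ \frac{\partial L}{\partial \beta} = -2 Z^T Y + 2 Z^T Z \beta = 0 \tag{6.8} \]

solving for $\hat{\beta}$

\[ \hat{\beta} = (Z^T Z)^{-1} Z^T Y \tag{6.9} \]

The expected value of $\hat{\beta}$ is simply found by

\[
\begin{aligned}
E[\hat{\beta}] &= E[(Z^T Z)^{-1} Z^T Y] \\
 &= E[(Z^T Z)^{-1} Z^T (Z\beta + \varepsilon)] \\
 &= E[(Z^T Z)^{-1} (Z^T Z\beta + Z^T \varepsilon)] \\
 &= E[(Z^T Z)^{-1} (Z^T Z)\beta + (Z^T Z)^{-1} Z^T \varepsilon] \\
 &= I\beta + E[(Z^T Z)^{-1} Z^T \varepsilon] \\
 &= \beta
\end{aligned}
\tag{6.10}
\]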
And the covariance matrix of $\hat{\beta}$ is found as (Myers, 1977)

\[
\begin{aligned}
\mathrm{Cov}[\hat{\beta}] &= E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)^T] \\
 &= \mathrm{Cov}[(Z^T Z)^{-1} Z^T Y] \\
 &= (Z^T Z)^{-1} Z^T \, \mathrm{Cov}(Y) \, [(Z^T Z)^{-1} Z^T]^T \\
 &= [(Z^T Z)^{-1} Z^T] \, \sigma^2 I \, [(Z^T Z)^{-1} Z^T]^T \\
 &= \sigma^2 (Z^T Z)^{-1}
\end{aligned}
\tag{6.11}
\]

The minimum mean square estimate (MMSE) for the model parameters has been defined by the observation vector Y and the design matrix Z. Further, the MMSE was found to be unbiased and to have a covariance $\sigma^2 (Z^T Z)^{-1}$. The covariance matrix can be used to establish confidence intervals for each of the model parameters and it allows the establishment of a prediction interval around any observation.
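Equations (6.9) and (6.11) can be sketched numerically as follows. This is a minimal illustration assuming NumPy; the function name `least_squares_fit` and the synthetic data are hypothetical, and the unknown $\sigma^2$ is replaced by its estimate SSE/(N - number of parameters).

```python
import numpy as np


def least_squares_fit(Z, Y):
    """Return (beta_hat, cov_beta, sse) for the linear model Y = Z beta + eps."""
    Z = np.asarray(Z, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n_obs, n_params = Z.shape
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    beta_hat = ZtZ_inv @ Z.T @ Y              # Eq. (6.9)
    residuals = Y - Z @ beta_hat
    sse = float(residuals @ residuals)
    s2 = sse / (n_obs - n_params)             # estimate of sigma^2
    cov_beta = s2 * ZtZ_inv                   # Eq. (6.11) with sigma^2 estimated
    return beta_hat, cov_beta, sse


# Synthetic check: a straight line with small Gaussian noise.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=200)
Z = np.column_stack([np.ones_like(x), x])
true_beta = np.array([2.0, 0.5])
Y = Z @ true_beta + rng.normal(scale=0.1, size=x.size)

beta_hat, cov_beta, sse = least_squares_fit(Z, Y)
print(beta_hat)
print(np.sqrt(np.diag(cov_beta)))   # standard errors of the estimates
```

The explicit inverse mirrors the equations; for an ill-conditioned Z a solver such as `np.linalg.lstsq` would be the safer choice.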

There  are  many  ways  in  which  an  experiment  can  be  designed  (i.e., 
selection  of  the  design  matrix  Z)  to  allow  efficient  estimation  of  the 
model  parameters.  Entire  textbooks  are  devoted  to  presenting  these 
techniques  (Cox,  1958;  Myers,  1977)  for  a  wide  variety  of  conditions. 
Therefore, a review of experimental design in general is not appropriate here.

6.1.3  Analysis  of  the  Prediction  Equation 

The  result  of  applying  the  technique  described  in  the  previous 
sections  is  a  prediction  equation  which  relates  the  image  parameters  to 
data  utility  for  a  specific  application. 

The data gathered in this experiment would also directly provide information concerning the magnitude of importance of an individual parameter or groups of image parameters. Analysis of variance techniques are applied to obtain this information. For example, suppose we wish to test the hypothesis that image parameter $x_1$ does not significantly affect the utility of the sensor data. To investigate this question one would calculate the sum of squares for error (SSE) for the original model. The SSE is defined as

\[ \mathrm{SSE} = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 \tag{6.12} \]

where:

$y_i$ = observed response for the experimental conditions defined by the ith row in the design matrix Z.
$\hat{y}_i$ = estimated (using Eq. (6.9)) response for the experimental conditions defined by the ith row in the design matrix Z.

Note that $\mathrm{SSE}/[N - (\ell+1)]$ is an estimate for $\sigma^2$. Next a reduced model would be defined by deleting all $x_1$ terms in the original model. An SSE would then be calculated using the reduced model, $\mathrm{SSE}_1$. The test statistic for the given hypothesis is given by

\[ F = \frac{(\mathrm{SSE}_1 - \mathrm{SSE})/(\ell - g)}{\mathrm{SSE}/[N - (\ell+1)]} \tag{6.13} \]

where:

N = number of observations
$\ell + 1$ = number of parameters ($\beta_i$) in the original model
$g + 1$ = number of parameters in the reduced model.

This test statistic has an F-distribution (probability density function) with $(\ell - g)$, $N - (\ell+1)$ degrees of freedom. Therefore, if $F > F_{(\ell-g),\,N-(\ell+1),\,\alpha}$ we reject the hypothesis that the sensor parameter $x_1$ does not significantly affect the utility of the sensor data with the probability of a TYPE I error equal to $\alpha$. Remember that a TYPE I error consists of rejecting the hypothesis when it should be accepted. So if we reject, then the data has provided strong evidence that the image parameter $x_1$ is important. Similar techniques can be applied to any term or group of terms in the original model. For example, we might wish to see if all quadratic terms, $x_i^2$, do not significantly affect the response. The motivation would be to simplify the model and thus supply more degrees of freedom for estimating the remaining parameters. If only one parameter, $\beta_j$, is tested then the above F value is called a partial F-test value. An overall F-test for the regression is found by testing the hypothesis that $\beta_1 = \ldots = \beta_\ell = 0$. If this F-test statistic exceeds some value specified from a selected risk level $\alpha$ then the regression is said to be statistically significant. That is, the variation of the data predicted by the model is greater than would be expected by a random occurrence at a probability of $1 - \alpha$. Although a specific estimated regression equation is statistically significant it does not follow necessarily that this equation is useful for prediction. One rule of thumb has been developed (Wetz, 1964) which states that if the F-test statistic is four times greater than $F_{\ell,\,N-(\ell+1),\,\alpha}$ then the regression equation in question is satisfactory for prediction.
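The reduced-model test of Eq. (6.13) can be sketched as below. This is an illustrative example assuming NumPy; the function names and the synthetic data are hypothetical, and the critical value $F_{(\ell-g),N-(\ell+1),\alpha}$ would still have to be taken from an F table.

```python
import numpy as np


def sse_of_fit(Z, Y):
    """Sum of squares for error, Eq. (6.12), of the least squares fit of Y on Z."""
    beta_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    residuals = Y - Z @ beta_hat
    return float(residuals @ residuals)


def reduced_model_f(Z_full, Z_reduced, Y):
    """F statistic of Eq. (6.13) and its two degrees of freedom."""
    n_obs, p_full = Z_full.shape          # p_full = l + 1
    p_reduced = Z_reduced.shape[1]        # p_reduced = g + 1
    sse = sse_of_fit(Z_full, Y)
    sse_1 = sse_of_fit(Z_reduced, Y)
    df_num = p_full - p_reduced           # l - g
    df_den = n_obs - p_full               # N - (l + 1)
    f_value = ((sse_1 - sse) / df_num) / (sse / df_den)
    return f_value, df_num, df_den


# Synthetic data in which x1 matters and x2 does not.
rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
Y = 1.0 + 3.0 * x1 + rng.normal(scale=0.5, size=100)
Z_full = np.column_stack([np.ones(100), x1, x2])

F_x1, df_num, df_den = reduced_model_f(Z_full, Z_full[:, [0, 2]], Y)  # x1 deleted
F_x2, _, _ = reduced_model_f(Z_full, Z_full[:, [0, 1]], Y)            # x2 deleted
print(F_x1, F_x2, df_num, df_den)
```

Deleting the influential variable produces a much larger F value than deleting the irrelevant one, which is the behaviour the test relies on.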

The coefficient of multiple determination, $R^2$, (Draper and Smith, 1966) is defined as

\[ R^2 = \frac{\sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \tag{6.14} \]

where:

\[ \bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i \tag{6.15} \]


This quantity represents the percentage of the variation in the response data explained by the estimated model. Note that if $N = \ell + 1$, then $R^2 = 1$ because there is a perfect fit to the data; therefore, when using $R^2$ as a goodness criterion care must be taken to ensure that there are sufficient degrees of freedom.
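The warning about degrees of freedom can be demonstrated with a small sketch. It assumes NumPy, and the function name `r_squared` and the data are hypothetical. Four pure-noise observations are fitted with four parameters, which drives $R^2$ of Eq. (6.14) to 1 even though the model explains nothing real.

```python
import numpy as np


def r_squared(Z, Y):
    """Coefficient of multiple determination, Eq. (6.14)."""
    beta_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    y_hat = Z @ beta_hat
    y_bar = Y.mean()                                   # Eq. (6.15)
    return float(np.sum((y_hat - y_bar) ** 2) / np.sum((Y - y_bar) ** 2))


rng = np.random.default_rng(3)
Y = rng.normal(size=4)                                 # pure noise, no structure
x = np.arange(4.0)
Z_saturated = np.column_stack([x ** p for p in range(4)])   # 4 parameters, N = 4
Z_line = Z_saturated[:, :2]                                  # 2 parameters, N = 4

r2_saturated = r_squared(Z_saturated, Y)
r2_line = r_squared(Z_line, Y)
print(r2_saturated, r2_line)
```

The saturated fit reaches $R^2 = 1$ to within rounding, while the two-parameter fit of the same noise does not.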

Sometimes it is convenient to establish a prediction interval (Mendenhall, 1968) around a specific response. This interval would define a range in which some future response would lie given this set of parameters and thus provide an indication of the operational utility of the images derived from a specific system. To develop the prediction interval we will first define the set of parameters as the vector $\theta$, i.e.,

\[ \theta^T = (1, x_{\theta_1}, \ldots, x_{\theta_k}) \tag{6.16} \]

The expected value of the response given the vector $\theta$ is

\[ E[\hat{y}] = \theta^T \beta \tag{6.17} \]

and its variance can be shown to be

\[ \mathrm{Var}[\hat{y}] = [\theta^T (Z^T Z)^{-1} \theta] \, \sigma^2 \tag{6.18} \]

The error, E, between a future response, $y_F$, and the estimated response, $\hat{y}$, as defined by the estimated model parameters is

\[ E = y_F - \hat{y} \tag{6.19} \]

so clearly

\[ E[E] = 0 \tag{6.20} \]

\[ \mathrm{Var}[E] = \mathrm{Var}[y_F] + \mathrm{Var}[\hat{y}] - 2\,\mathrm{Cov}(y_F, \hat{y}) \tag{6.21} \]

but the future and the estimated response would be uncorrelated, yielding

\[ \mathrm{Var}[E] = \mathrm{Var}[\hat{y}] + \mathrm{Var}[y_F] \tag{6.22} \]

The variance of the future utility of the image data, $y_F$, is assumed to be $\sigma^2$, thus

\[ \mathrm{Var}[E] = \sigma^2 [1 + \theta^T (Z^T Z)^{-1} \theta] \tag{6.23} \]

A $100(1 - \alpha)\%$ prediction interval can now be defined (remember $\hat{y}$ and $y_F$ are both normal so E is also normal) as

\[ \hat{y} \pm t_{N-(\ell+1),\,\alpha/2} \, S \sqrt{1 + \theta^T (Z^T Z)^{-1} \theta} \tag{6.24} \]

where:

$S^2$ = estimate of $\sigma^2$ = $\mathrm{SSE}/[N - (\ell+1)]$

$t_{N-(\ell+1),\,\alpha/2}$ = the t value for $N - (\ell+1)$ degrees of freedom at $\alpha/2$.

The interpretation of this interval is simple; there is a $(1 - \alpha)$ probability that a future measurement of the sensor's data utility will lie inside this interval.
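A minimal sketch of the interval in Eq. (6.24) follows. It assumes NumPy; the function name `prediction_interval` and the synthetic data are hypothetical, and the t value is passed in as an argument so that it can be read from a standard t table (2.048 is the tabulated value for 28 degrees of freedom at $\alpha/2$ = .025).

```python
import numpy as np


def prediction_interval(Z, Y, theta, t_crit):
    """Lower and upper limits of the prediction interval of Eq. (6.24)."""
    Z = np.asarray(Z, dtype=float)
    Y = np.asarray(Y, dtype=float)
    theta = np.asarray(theta, dtype=float)
    n_obs, n_params = Z.shape
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    beta_hat = ZtZ_inv @ Z.T @ Y                      # Eq. (6.9)
    residuals = Y - Z @ beta_hat
    s2 = float(residuals @ residuals) / (n_obs - n_params)
    y_hat = float(theta @ beta_hat)                   # estimated response
    half_width = float(t_crit * np.sqrt(s2 * (1.0 + theta @ ZtZ_inv @ theta)))
    return y_hat - half_width, y_hat + half_width


# Synthetic data: N = 30 observations, 2 parameters, so 28 degrees of freedom.
rng = np.random.default_rng(5)
x = rng.uniform(0.0, 10.0, size=30)
Z = np.column_stack([np.ones_like(x), x])
Y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=x.size)

lower, upper = prediction_interval(Z, Y, theta=[1.0, 5.0], t_crit=2.048)
print(lower, upper)
```

The interval is centred on the estimated response at x = 5 and is wider than a confidence interval for the mean because of the extra "1 +" term in Eq. (6.23).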

6.1.4  Blocking 

Additional variation in the response could be introduced into this experiment from differences between interpreters and differences between the four scenes. A technique to reduce these variations is thus required. A method known as blocking was employed to accomplish this task. Blocking will be explained by the following example.

Suppose we conducted this experiment with only three interpreters and two scenes; further, assume that only one variable (image property) is used. The following linear statistical model is proposed:

\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \varepsilon \tag{6.25} \]

where:

$y$ = interpreter's response
$x_1$ = image variable
$x_2$ = 1, if the response is from interpreter #1
$x_2$ = 0, if the response is not from interpreter #1
$x_3$ = 1, if the response is from interpreter #2
$x_3$ = 0, if the response is not from interpreter #2
$x_4$ = 1, if the response is from scene A
$x_4$ = 0, if the response is not from scene A
$\varepsilon$ = random error

Note that $x_2 = x_3 = 0$ implies that the response is from interpreter #3; similarly $x_4 = 0$ implies that the response is from scene B. Next consider the interpretation of $\beta_2$, $\beta_3$ and $\beta_4$. Under the assumption that interpreter #1 and scene B were used, the model takes the form

\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon \tag{6.26} \]

while the assumption that interpreter #3 and scene B were used yields

\[ y = \beta_0 + \beta_1 x_1 + \varepsilon. \tag{6.27} \]

Clearly the parameter $\beta_2$ represents the amount that we expect the response to increase or decrease on the average as we move from interpreter #3 to interpreter #1. Similarly $\beta_3$ represents the expected difference between interpreter #2 and #3, while $\beta_4$ is the expected difference between scene A and B.

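Next consider three specific observations.

\[ y_{1B} = \beta_0 + \beta_1 x_1 + \varepsilon_{1B} \quad \text{(from interpreter \#3, scene B)} \tag{6.28} \]

\[ y_{2B} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon_{2B} \quad \text{(from interpreter \#1, scene B)} \tag{6.29} \]

\[ y_{3B} = \beta_0 + \beta_1 x_1 + \beta_3 x_3 + \varepsilon_{3B} \quad \text{(from interpreter \#2, scene B)} \tag{6.30} \]

Averaging over the three interpreters we obtain:

\[ \bar{y}_B = \beta_0 + \beta_1 x_1 + \frac{\beta_2 + \beta_3}{3} + \bar{\varepsilon}_B \tag{6.31} \]

where:

\[ \bar{\varepsilon}_B = \frac{1}{3} \sum_{i=1}^{3} \varepsilon_{iB}. \tag{6.32} \]

Similarly for scene A

\[ y_{1A} = \beta_0 + \beta_1 x_1 + \beta_4 x_4 + \varepsilon_{1A} \tag{6.33} \]

\[ y_{2A} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_4 x_4 + \varepsilon_{2A} \tag{6.34} \]

\[ y_{3A} = \beta_0 + \beta_1 x_1 + \beta_3 x_3 + \beta_4 x_4 + \varepsilon_{3A} \tag{6.35} \]

and

\[ \bar{y}_A = \beta_0 + \beta_1 x_1 + \beta_4 x_4 + \frac{\beta_2 + \beta_3}{3} + \bar{\varepsilon}_A \tag{6.36} \]

To estimate the difference between the responses from scene A and B we form

\[ \bar{y}_A - \bar{y}_B = \beta_4 + (\bar{\varepsilon}_A - \bar{\varepsilon}_B) \tag{6.37} \]

where:

$\bar{\varepsilon}_A - \bar{\varepsilon}_B$ = error of estimation.

The effects of $\beta_0$, $\beta_1$, and the interpreter terms $\beta_2$ and $\beta_3$ cancel out, thereby reducing the error in estimating $\beta_4$. Similar observations can be made concerning $\beta_2$ and $\beta_3$.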

Blocking reduces the error, caused by fluctuations between interpreters and scenes, in estimating the model parameters associated with the desired experimental variables. It also allows for the investigation of the magnitude of these fluctuations.
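The three-interpreter, two-scene example can be simulated with a short sketch. It assumes NumPy; the coefficient values, the noise level and the helper names are hypothetical and chosen only for illustration. The indicator columns follow Eq. (6.25), with interpreter #3 and scene B as the baseline.

```python
import numpy as np


def blocked_design(x1, interpreter, scene):
    """Design matrix of Eq. (6.25): intercept, x1, and indicators x2, x3, x4."""
    x2 = (interpreter == 1).astype(float)     # interpreter #1
    x3 = (interpreter == 2).astype(float)     # interpreter #2
    x4 = (scene == "A").astype(float)         # scene A
    return np.column_stack([np.ones_like(x1), x1, x2, x3, x4])


def residual_variance(Z, y):
    """SSE divided by its degrees of freedom for the least squares fit."""
    beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
    residuals = y - Z @ beta_hat
    return float(residuals @ residuals) / (Z.shape[0] - Z.shape[1])


# Balanced layout: every interpreter sees every image level in both scenes.
rows = [(x, i, s) for x in (0.0, 1.0, 2.0, 3.0) for i in (1, 2, 3) for s in ("A", "B")]
x1 = np.array([r[0] for r in rows])
interpreter = np.array([r[1] for r in rows])
scene = np.array([r[2] for r in rows])

beta = np.array([5.0, 1.5, 2.0, -1.0, 0.7])   # hypothetical beta_0 ... beta_4
Z = blocked_design(x1, interpreter, scene)
rng = np.random.default_rng(4)
y = Z @ beta + rng.normal(scale=0.1, size=len(rows))

scene_difference = y[scene == "A"].mean() - y[scene == "B"].mean()   # Eq. (6.37)
s2_blocked = residual_variance(Z, y)
s2_unblocked = residual_variance(Z[:, :2], y)
print(scene_difference, s2_blocked, s2_unblocked)
```

In this balanced layout the difference of scene averages recovers $\beta_4$ despite large interpreter offsets, and omitting the blocking columns inflates the residual variance.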

6.1.5  Results  of  the  Linear  Regression  Analysis 

A full model with 21 parameters defined by equation (6.2) and 10 blocking parameters was proposed and the minimum mean square estimates of the parameters were found. Of the 10 blocking parameters, 7 account for the eight interpreters, and 3 account for the four different scenes. A reduced model (one in which no blocking variables were incorporated) was also used. The purpose of this section is to present the results of the regression analysis and to indicate appropriate conclusions which may be drawn from it.

Table 4 contains the coefficient of multiple determination, $R^2$, SSE, the F-test value, and the estimated random variance for the full and reduced models for all five response categories. Also included in Table 4 are results from reduced models where the cross terms, $x_i x_j$, $i \neq j$, have been deleted and where the square terms $x_i^2$ and cross terms have been removed.

Table 4

Evaluation of Predicted Regression Models

                                        Category
                                  1           2           3           4           5
Full Model
  R^2                          76.7        54.0        64.7        52.6        56.0
  F                            30.9        9.45        17.2        10.7        12.2
  Degrees of freedom        30, 281     30, 241     30, 281     30, 289     30, 289
  F(l, N-(l+1), .05)           1.52        1.52        1.52        1.52        1.52
  S^2                           1.0         1.4          .8         1.1         .04
  SSE                         285.2       346.6       225.0       328.3       11.66

Reduced Model (no blocking)
  R^2                          58.9        17.0         9.8        15.2        53.8
  F                            20.9         2.6         1.6         2.7        17.4
  Degrees of freedom        20, 291     20, 251     20, 291     20, 299     20, 299
  F(l, N-(l+1), a)             1.62        1.62        1.62        1.62        1.62
  S^2                           1.7         2.5         2.0         2.0         .04
  SSE                         503.7       626.2       575.5       586.9       12.23

Reduced Model (no blocking and no x_i x_j terms)
  R^2                          52.0                                            45.5
  F                            32.6                                            25.8
  Degrees of freedom        10, 301
  F(l, N-(l+1), a)             1.85                                            1.85
  S^2                           1.9                                            .047
  SSE                         588.6                                            14.2

Reduced Model (no blocking and no x_i^2 and x_i x_j terms)
  R^2                          40.9                                            36.5
  F                            42.4                                            36.1
  Degrees of freedom         5, 306                                          5, 314
  F(l, N-(l+1), a)             2.23                                            2.23
  S^2                           2.4
  SSE                         724.6                                            16.8

The first observation to be made from this data is that the prediction equation is statistically significant for all response categories when the blocking variables are included in the model. Further review shows that response category 1 (linear features) provided the best prediction equation in terms of maximum $R^2$.

However, when the blocking variables are removed only response categories 1 and 5 still produce statistically significant regression equations. Obviously most of the variation in the interpreter responses for categories 2, 3 and 4 predicted by the full model was accounted for in the blocking variables and not by the image metrics. This was expected from the results of the correlation analysis, i.e., the correlation between interpreters was low for response categories 2, 3, and 4. The interpreters were not consistent in judging these categories and therefore a reasonable prediction equation cannot be estimated. These results also indicate that reasonable prediction equations can be obtained for response categories 1 and 5, so the rest of our discussion will only deal with these categories. Note that this result could also be predicted from the correlation analysis because the correlation between interpreters was reasonably high for categories 1 and 5 and, therefore, the interpreters were consistent in judging the images with respect to these response categories.

Conducting a hypothesis test to determine if the blocking variables are needed we find that for category 1

\[ F = \frac{(\mathrm{SSE}_1 - \mathrm{SSE})/(\ell - g)}{\mathrm{SSE}/[N - (\ell+1)]} = \frac{(503.7 - 285.2)/10}{285.2/281} = 21.53 \tag{6.38} \]

but $F_{(\ell-g),\,(N-(\ell+1)),\,.05} = F_{10,281,.05} = 1.88$. Therefore we reject the hypothesis that the blocking variables are not required. This indicates that there was a statistically significant difference between the responses for various interpreters and scenes. For response category 5 the F-test value is 1.39 while $F_{10,299,.05} = 1.88$, therefore we can accept the hypothesis that the blocking variables are not required for the relative ranking of the degraded images. Note that this standard hypothesis test is constructed to provide information when the hypothesis is rejected, as with category 1. However, given the magnitude of the F-test value we can make some comments as to how strongly we believe that the hypothesis is truly correct. This is done by computing the probability that $f < F$ where f has an F-pdf with 10 and 299 degrees of freedom. If this probability is close to $1 - \alpha$ then we would not feel strongly that the hypothesis is correct; however, if $P[f < F]$ is far from $1 - \alpha$ then it would be reasonable to accept the hypothesis. For this case $p = P[f < F] = .81$. We feel that this is strong enough evidence to indicate that the blocking variables are not required for category 5. Using the same approach we found that both the cross terms $x_i x_j$ and the squared terms $x_i^2$ are required in the model for both response categories.
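The category 1 arithmetic can be checked directly from the Table 4 entries. The short sketch below uses a hypothetical helper name, `nested_f_statistic`; the critical value 1.88 is the one quoted in the text and is not computed here.

```python
def nested_f_statistic(sse_reduced, sse_full, df_removed, df_error):
    """F statistic for comparing a reduced model against the full model."""
    return ((sse_reduced - sse_full) / df_removed) / (sse_full / df_error)


# Category 1: SSE without blocking = 503.7, SSE of the full model = 285.2,
# 10 blocking parameters removed, 281 error degrees of freedom.
f_category_1 = nested_f_statistic(503.7, 285.2, 10, 281)
print(round(f_category_1, 2))          # 21.53

blocking_required = f_category_1 > 1.88   # critical value quoted in the text
print(blocking_required)                   # True
```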

Now  that  we  have  established  the  quality  of  the  prediction  equation 
it  is  left  to  present  the  coefficient  estimates  and  discuss  the  relative 
importance  of  various  model  parameters.  Four  models,  i.e.,  prediction 
equations,  will  be  presented  for  response  categories  1  and  5. 

Table 5 contains the F-test value for ten reduced models. These results were generated to establish the relative importance of each of the five metrics. In the first five models the blocking variables were retained while the variables (linear, squared, and cross) associated with each of the metrics in turn were deleted. The second five models deleted both the blocking variables and the specific metric variables. The higher the F-test value for a particular model the more significant the deleted variables are for predicting the response.

Table 5

Evaluation of the Importance of Image Quality Parameters

                                                  Category
                                               1           5
Reduced Model, Variable Terms Removed
  RMSE      R^2                            76.26       53.56
            F                                .99        2.67
  RMSB      R^2                             74.8        52.3
            F                               3.93        3.92
  Geo       R^2                            76.26        52.4
            F                                .98        3.96
  S/N       R^2                             75.2        51.8
            F                               3.13        2.25
  DYN       R^2                             76.3        38.6
            F                                .92       19.21

Blocking and Variable Terms Removed
  RMSE      R^2                             47.7        50.3
            F                               13.5        3.88
  RMSB      R^2                             54.7        50.7
            F                               5.09        3.63
  Geo       R^2                             48.4        49.6
            F                              12.63        6.83
  S/N       R^2                             55.0        48.6
            F                               4.73        6.17
  DYN       R^2                             51.1        33.8
            F                               9.39       22.17

As indicated in the correlation analysis the order of importance of the five metrics is different for different response categories. Considering the models with the blocking variables included we notice that the RMSB and S/N have about equal importance, while with the other three we accept the hypothesis that they are not required taken individually in the model with a p = .56 for category 1. For category 5 all the metrics are required in the model but the dynamic range is by far the most significant with the other four being of about equal importance. Interestingly, when the blocking variables are removed the order of importance of the metrics is changed for category 1. Now the RMSB and S/N are the least important metrics for category 1. However, because the blocking variables are required for category 1 the initial ordering of the importance of the metrics will be accepted.

Tables 6-13 contain the regression coefficient estimates for several different models for categories 1 and 5. These tables include upper and lower confidence limits ($\alpha$ = .1) for each coefficient estimate, its standard error, adjusted sum of squares and its partial F-test value. These tables represent image quality predictions based on measured image properties.

TABLE 6

Regression Coefficients for Response Category 1 for the Full Model

TABLE 10

Regression Coefficients for Response Category 5 for the Full Model

TABLE 11

Regression Coefficients for Response Category 5 Reduced Model (No Blocking)

From this regression analysis several conclusions can be stated:

a) Statistically significant regression equations cannot be estimated for response categories 2, 3, and 4.

b) Statistically significant regression equations can be estimated for response categories 1 and 5 even when the models are reduced.

c) The blocking variables are required for response category 1 but not for response category 5.

d) The square and cross product terms are required for both response categories 1 and 5.

e) RMSB and S/N are the most important variables for response category 1.

f) Dynamic range is the most important variable for response category 5.


6.2  Non-Linear  Models  and  Results 


Recently a study was conducted (Moore, 1979) which related the "interpretability" of radar images to the product of the spatial resolution and grey-level resolution. The purpose of this section is to determine if this relationship can be derived from the experimental data used here.

The product of the spatial resolution and the grey-level resolution is defined as the spatial-grey-level (SGL) volume. The nonlinear model proposed in Moore (1979) to relate interpretability to the SGL was

\[ I = I_0 \exp\left[-\frac{V}{V_c}\right] \tag{6.39} \]

where:

$I_0$ and $V_c$ are the model parameters
$V = r_s r_{gL}$
$r_s$ = spatial resolution (2-dimensional)
$r_{gL}$ = grey-level resolution

The grey-level resolution is a measure of the width of the noise probability density function (pdf) for a specific system, i.e., the noise variance is dependent upon the amount of averaging performed by the system. The measure used in Moore (1979) to measure the width of the pdf was the ratio of the 90% to the 10% points. For a Gaussian approximation of the gamma pdf which describes the radar image noise, $r_{gL}$ becomes

\[ r_{gL} = \frac{\sqrt{N} + 1.282}{\sqrt{N} - 1.282} \tag{6.40} \]

where:

N = number of looks averaged by the radar.
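Equation (6.40) is simple to evaluate for a given number of looks. The sketch below uses a hypothetical function name, `grey_level_resolution`. The constant 1.282 is the 90% point of the standard normal distribution, so the Gaussian approximation only gives a positive ratio when $\sqrt{N}$ exceeds 1.282, i.e., for two or more looks.

```python
import math


def grey_level_resolution(n_looks):
    """Ratio of the 90% to the 10% point of the noise pdf, Eq. (6.40)."""
    root_n = math.sqrt(n_looks)
    if root_n <= 1.282:
        raise ValueError("Gaussian approximation requires sqrt(N) > 1.282")
    return (root_n + 1.282) / (root_n - 1.282)


r_4 = grey_level_resolution(4)
r_16 = grey_level_resolution(16)
r_64 = grey_level_resolution(64)
print(r_4, r_16, r_64)
```

The ratio falls toward 1 as more looks are averaged, reflecting the narrower noise pdf.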

In our experiment we did not measure $r_{gL}$ as a measure of the width of the noise pdf; rather we used the S/N ratio. Also, the two-dimensional spatial resolution was not measured directly; rather the RMSB was obtained. However, the spatial resolution can be approximated for a fully focused system by

\[ r_s = 1/\mathrm{RMSB}. \tag{6.41} \]


Therefore,  the  SGL  volume  used  in  our  experiment  is  defined  as 


(RMSB) (S/N) 


(6.42) 


In  addition  a  modified  SGL  volume  was  defined  as 


Vm  =  (RMSB) (S/N) (D) 


(6.43) 


where: 

D  =  Dynamic  range. 


The  modified  SGL  volume  attempts  to  account  for  the  variation  in  dynamic 
range  introduced  into  the  degraded  images.  The  original  exponential 
model  was  also  slightly  modified  to  allow  for  a  possible  bias  so  the 
nonlinear  model  used  here  was 

\[ I = \alpha + \beta e^{\gamma V} \tag{6.44} \]

Because  of  the  results  of  the  correlation  and  linear  regression 
analysis  only  response  categories  1  and  5  will  be  reported.  Table  14 
contains  the  parameter  estimates  for  categories  1  and  5  along  with  the 
SSE.  From  this  data  and  plots  of  Vm  and  V  versus  the  interpreter 
responses  we  concluded  that  this  nonlinear  model  does  not  provide  an 
acceptable  description  of  the  interpreter  responses  for  this  experiment. 
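The report does not state how the parameters of Eq. (6.44) were estimated. One simple possibility, shown here purely as an assumption, is to scan a grid of $\gamma$ values and solve for $\alpha$ and $\beta$ by linear least squares at each one, keeping the combination with the smallest SSE. The sketch assumes NumPy; the function name `fit_exponential_model`, the grid and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_exponential_model(V, I, gammas):
    """Fit I = alpha + beta * exp(gamma * V) by a grid search over gamma.

    For each candidate gamma the model is linear in alpha and beta, so those
    two are found by ordinary least squares. Returns (alpha, beta, gamma, sse).
    """
    V = np.asarray(V, dtype=float)
    I = np.asarray(I, dtype=float)
    best = None
    for gamma in gammas:
        A = np.column_stack([np.ones_like(V), np.exp(gamma * V)])
        coef, *_ = np.linalg.lstsq(A, I, rcond=None)
        residuals = I - A @ coef
        sse = float(residuals @ residuals)
        if best is None or sse < best[3]:
            best = (float(coef[0]), float(coef[1]), float(gamma), sse)
    return best


# Synthetic responses generated from known parameters plus small noise.
rng = np.random.default_rng(6)
V = np.linspace(0.0, 50.0, 60)
I = 1.0 + 4.0 * np.exp(-0.05 * V) + rng.normal(scale=0.01, size=V.size)

gamma_grid = np.linspace(-0.2, -0.005, 40)     # gamma = 0 is excluded (degenerate)
alpha_hat, beta_hat, gamma_hat, sse = fit_exponential_model(V, I, gamma_grid)
print(alpha_hat, beta_hat, gamma_hat, sse)
```

The resulting SSE can be compared with the SSE of the linear regression models in the same way as the entries of Table 14.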
Table 14

Image Quality Based on the Spatial-Grey-Level Volume

Column headings: Category 1 (SGL Volume, Modified SGL Volume); Category 5 (SGL Volume, Modified SGL Volume)

$\alpha$: .765, 2.4, 2.61
$\beta$: 3.75, 147.0, 39.7, 4.0
$\gamma$: -.01, -.18, -.01
SSE: 1004.4, 864.0, 740.5
$s^2$: 3.25, 2.73, 2.34
7.0  CONCLUSIONS


The purpose of this study was to investigate the relationship between measurable properties of radar images and the utility of those images for specific information extraction tasks. It was hoped that such a relationship would be useful for system design and image simulation. Five measurable image properties were identified: dynamic range, signal-to-noise ratio, image bandwidth, geometric fidelity, and root-mean-square error. These metrics were determined to be linked to independent characteristics of the radar image and were either directly or indirectly related to many image quality parameters proposed in the past.

Clearly there are no physical laws governing the functional dependence of image utility upon these image metrics; therefore an experiment was conducted to empirically estimate a functional form for describing this dependence. This experiment consisted of obtaining SAR imagery and digitally processing it to create a set of radar images with controlled levels of image "quality". These images were presented to human interpreters who were asked to evaluate the usefulness of each image for extracting four classes or categories of terrain features: linear features, natural area features, complex area features, and individual man-made targets. In addition to these absolute responses or rankings of the images, the interpreters also rank ordered the images from best to worst relative to their evaluation of the utility of each image for assessing vehicle movement and activity level. Thus for each radar image the interpreter provided five responses. Also for each radar image each of the five image metrics was calculated. A functional relationship between each response category and the image metrics was then estimated.

The first conclusion drawn from this experiment was that statistically significant regression equations could not be obtained to relate the extraction of natural area features, complex area features, and individual man-made targets to the quality of radar images as judged by the interpreters. A possible explanation for this is that the interpreters did not use uniform criteria for these response categories owing to the complexity of these image features. However, for the simpler categories, i.e., linear features and relative ranking, statistically significant regression equations could be estimated. (Only second order interactions were considered in the regression equation.) Next it was found that for these categories the full second-order model was required to maintain the statistical significance of the regression equation. Another conclusion reached by this experiment was that different image metrics assume varying levels of importance as the response category is changed. For example, bandwidth and the signal-to-noise ratio were the most important metrics in estimating the ability of an interpreter to extract linear features from radar images. Dynamic range was predominant for estimating how an interpreter would relatively rank the radar images. This observation has important ramifications for the application of image quality metrics to multi-mission sensor design. That is, the system designer will have to trade off system performance as a function of the application. A further conclusion obtained was that the nonlinear regression equations based upon the SGL volume were not statistically significant given these experimental data.

Recent advances in SAR systems, specifically the availability of digital SAR images obtained from spaceborne platforms and digital image processing technology, signify that more information extraction tasks will be performed by computer. Thus it is recommended that a study of this type be conducted to determine the relationship between measurable image properties which are related to system parameters and the success of automated information extraction algorithms for radar. For example, how is the probability of correct classification of targets related to measurable image properties? Using the computer to measure the utility of the radar images will remove the major disadvantage of the current study, i.e., relying on human judgments.
APPENDIX  A
INTERPRETER  PACKET 


INTRODUCTION 


The  purpose  of  this  study  is  to  establish  quantitative  techniques 
for  predicting  interpreter  performance  when  an  imaging  radar  is  used  as 
the  reconnaissance  sensor.  This  study  consists  of  several  phases.  First, 
quantitative  radar  image  quality  factors  were  derived,  then  radar  imagery 
was  processed  to  exhibit  controlled  levels  of  image  quality.  Next, 
empirical  data  is  to  be  gathered  on  interpreter  performance  vs.  image 
quality. Last, this data will be used to derive the relationship between quantitatively measured image quality and interpreter performance.

Your  contribution  to  this  study  is  to  interpret  the  given  radar 
imagery  following  the  instructions  and  guidelines  and  thus  to  provide  the 
required  empirical  data.  Also  as  part  of  the  study  the  questions  dealing 
with  interpretation  experience  need  to  be  answered. 

The  original  radar  imagery  was  collected  by  an  X-band  sensor  with 
HH  polarization  and  resolution  of  approximately  15  ft.  with  the  look 
direction  always  left  to  right.  There  are  four  different  target  areas  used 
in  this  study.  The  numbering  scheme  assigned  100-199  and  500-599  to 
Scene  A,  200-299  and  600-699  to  Scene  B,  300-399  and  700-799  to  Scene  C, 
and  400-499  and  800-899  to  Scene  D.  The  target  scenes  are  in  the  Roanoke, 
Virginia  area. 
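The numbering scheme above maps an image number to its scene. A small lookup sketch follows; the function name `scene_for_image` is hypothetical and only encodes the ranges stated in the packet.

```python
def scene_for_image(image_number):
    """Return the scene letter (A-D) for an image number in 100-899."""
    if not 100 <= image_number <= 899:
        raise ValueError("image numbers in this study run from 100 to 899")
    hundreds = image_number // 100            # 1 through 8
    return "ABCD"[(hundreds - 1) % 4]         # 1,5 -> A; 2,6 -> B; 3,7 -> C; 4,8 -> D


print(scene_for_image(150), scene_for_image(650), scene_for_image(799), scene_for_image(400))
```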

Ancillary data are also provided in the form of aerial photographs (scale = 1:16000, taken in March 1968), USGS maps (scale = 1:24000, 1968) and enlargements of the original SAR imagery (scale = 1:16000, collected in June 1968). Also provided will be lists of targets of interest contained within each scene and definitions of the ranking criteria for each target. You should use the ancillary data provided to locate each indicated target within each scene, then use the guidelines for ranking the quality of each target signature. We greatly appreciate your assistance in this study.
INSTRUCTIONS


A.  Interpret  the  images  in  the  order  presented. 

B.  Use  all  available  standard  photo-interpretation  equipment. 

C.  Use  the  air  photos,  maps  and  radar  images  provided  to  locate  each 
target  within  the  processed  radar  images. 

D.  For  each  scene  rank  only  those  targets  which  are  indicated  as  being 
in  that  area. 

E.  Provide  the  numerical  ranking  for  each  target  signature  according  to 
the  given  definitions  and  guidelines. 

F.  Be  consistent  in  judging  each  image,  that  is,  use  the  given  guidelines 
for  evaluating  each  target. 

G.  Take  at  least  a  10-minute  rest  after  every  hour  of  interpretation.

QUESTIONS


Please  answer  the  following  questions: 

1.  Number  of  years  of  radar  image  interpretation  experience?  _

2.  General  mission  objectives  of  your  image  interpretation  work? 


3.  List  all  photo-interpretation  equipment  used  for  the  analysis  of 
these  test  images. 


4.  Time  used  for  this  study?

GUIDELINES


These guidelines provide definitions for each target and its interpretation
levels. In general the highest interpretation level of the six available
is identification, which indicates that the target can be distinguished
from all others similar to it, e.g., an automobile bridge instead of just
a bridge. This ranking is divided into two levels of certainty--possible
and probable. A probable identification is when it is probable that the
target signature can be correctly identified while a possible
identification is when the target signature can possibly be correctly
identified.

The  next  interpretation  level  is  classification.  Classification  is
being  able  to  group  a  target  return  into  a  category,  for  example,  being 
able  to  classify  a  signature  as  a  bridge.  Again  this  level  is  divided  into 
two  levels  of  certainty.  A  probable  classification  is  when  a  signature 
can  probably  be  grouped  into  a  category  while  a  possible  classification 
is  when  a  signature  can  possibly  be  correctly  grouped  into  a  category. 

The lowest level is detection. Detection is when there is enough target
signature to determine that there is something there while not being able
to either identify or classify it. A possible detection is when it is
possible that there is a target signature present while a probable detection
is when it is probable that a target signature is present. No detection
is when no target signature is present.

The  procedure  for  attaching  one  of  these  interpretation  levels  to  each 
target  feature  will  be  as  follows: 

1.  Check  to  see  if  the  target  is  present  in  the  scene. 

2.  If  it  is  present,  next  find  its  location  in  the  ancillary  photos, 
SAR  images  and  maps  and  in  the  test  image. 


3.  Next  refer  to  the  definition  of  the  numerical  rankings  for  this 
target. 

4.  Examine  this  target  signature  in  the  test  image. 

5.  Assign  a  numerical  ranking  of  identifiability  based  on  the
supplied  definitions. 

For example, consider the airport category. The ancillary data are
used to find the location of its target signature within the degraded
image. If its target signature is not present in the degraded image then
a numerical ranking of 0 is assigned. A ranking of 1 is used if its target
signature is possibly present while a ranking of 2 is assigned if its
target signature is probably present. If the target signature can possibly
be classified as an airport then a ranking of 3 is used, while if the
signature can probably be classified as an airport then a ranking of
4 is assigned. If further information about the target signature can be
obtained then a ranking of 5 or 6 will be used. Specifically, if the
application of the airport can possibly be identified then a ranking of 5 is
used, while if the application of the airport can probably be identified
then a ranking of 6 is assigned.
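
The correspondence between interpretation level, certainty and numerical ranking in the airport example can be summarized in a few lines of Python. This is only a sketch, under the assumption that the same 0 to 6 scale applies to every target category as the guidelines suggest; the names `Level`, `Certainty` and `numerical_ranking` are hypothetical and do not come from the study materials.

```python
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    """Interpretation levels, lowest to highest (illustrative names)."""
    DETECTION = 0
    CLASSIFICATION = 1
    IDENTIFICATION = 2


class Certainty(IntEnum):
    """The two levels of certainty attached to each interpretation level."""
    POSSIBLE = 1
    PROBABLE = 2


def numerical_ranking(level: Optional[Level] = None,
                      certainty: Optional[Certainty] = None) -> int:
    """Map a (level, certainty) pair to the 0-6 ranking of the guidelines.

    Passing no level means no target signature is present (ranking 0).
    """
    if level is None:
        return 0
    if certainty is None:
        raise ValueError("a certainty is required once a level is given")
    # Each level spans two consecutive rankings: possible, then probable.
    return 2 * int(level) + int(certainty)
```

For instance, `numerical_ranking(Level.CLASSIFICATION, Certainty.PROBABLE)` gives 4, which matches the ranking assigned when a signature can probably be classified as an airport.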

ANSWER SHEET


Interpretation Categories                    Numerical Rankings

Linears Mostly Along Track (a)
    Roads
        Paved
            Two Lane
            Four Lane

Linears Mostly Cross Track (b)
    Roads
        Paved
            Two Lane
            Four Lane (d)

Linears Mostly Diagonal (c)
    Roads
        Paved
            Two Lane
            Four Lane
        Unpaved
            Two Lane
        Divided Highway

Natural Area Features
    Forest
        Deciduous
        Mixed
    Agriculture
    Rivers and Streams

Complex Area Targets
    Airports
    Railroad Yards
    Industrial Areas
    Commercial Areas
    Residential Areas

Individual Man-Made Targets
    Buildings
    Bridges
    Storage Tanks

*  For these ratings indicate in the appropriate square whether the
   identification was based upon direct evidence or from the target context.
a  Angle of intersection of road and the flight path = 0-15°
c  Angle of intersection of road and the flight path = 15-75°
d  Roads wide enough to carry four lanes of traffic are considered
   four lane even if the outside two lanes are used for parking.


[Answer grid not recoverable from the scan. Surviving labels: rows numbered 1 through 22; columns A, B, C and D; numerical ranking columns 6, 5, 4, 3, 2, 1 and 0; squares marked *D and *C.]

Interpreter  Identification  #  _ 

Separate  the  images  by  scene  and  rank  them  from  best  to  worst  in  terms  of  the  given  applications. 

For  vehicle  movement  consider  the  potential  for  the  movement  of  military  vehicles,  e.g.  heavy 
trucks  or  tanks,  through  the  given  terrain.  Terrain  features  which  are  important  for  assessing 
vehicle  movement  include  bridge  capacity,  forest  density,  topographic  slopes,  water  bodies  and 
their  depth,  soils,  street  size,  and  road  types,  e.g.  paved  and  unpaved.  Evaluate  and  rank  the  images 
with  respect  to  the  quality  of  each  image  for  specifying  vehicle  movement  potential. 

For activity level consider your ability to detect specific levels of activity. That is, order the
degraded images from best to worst based on your assessment of the quality of each image for
detecting levels of activity. For example, consider the relative quality of each image for detecting
a convoy on the roads, a large increase or decrease in aircraft at the airport, or a large increase or
decrease of rail cars at the railyard.


Scene A        Vehicle Movement        Activity Level


Scene B        Vehicle Movement        Activity Level


Scene C        Vehicle Movement        Activity Level

PUBLISHED PAPERS AND SCIENTIFIC PERSONNEL
SUPPORTED BY RESEARCH AGREEMENT


PUBLISHED  PAPERS 


"Simulation of Imaging Radar Systems," (J. Holtzman, V. Kaupp, V. Frost,
J. Abbott, and R. Martin), Eighth Annual Pittsburgh Conference on
Modeling and Simulation, The University of Pittsburgh, Pittsburgh,
Pennsylvania, April 21-22, 1977.

"Image Synthesis for SAR Systems, Calibration, and Processor Design,"
(J.C. Holtzman, V.S. Frost, J.L. Abbott, and V. Kaupp), Proceedings
of the Synthetic Aperture Radar Technology Convention, March 8-10,
1978, Las Cruces, New Mexico.

"An Image Simulation Model for Radar Guidance," (J.C. Holtzman, V. Kaupp,
V.S. Frost, J. Abbott), Ninth Annual Pittsburgh Conference on
Modeling and Simulation, The University of Pittsburgh, Pittsburgh,
Pennsylvania, April, 1978.

"Radar Image Simulation," (J.C. Holtzman, V.S. Frost, J. Abbott,
V. Kaupp), IEEE Transactions on Geoscience Electronics, Vol. GE-16,
No. 5, October, 1978.

"Development of Statistical Models for Radar Image Analysis and
Simulation," (V.S. Frost), Master's Thesis, University of Kansas,
1978.

"Computer Generated Radar Images for Navigation," (J.C. Holtzman,
V.S. Frost, J.A. Stiles, and V. Kaupp), Proceedings of the Military
Electronics Expo '78, Anaheim, California, November 14-16, 1978.

"Seasonal Effects on Radar Imagery as Predicted by the PSM Simulation
Techniques," (J.C. Holtzman, V.S. Frost, J.A. Stiles, E.E. Komp,
E.S. Bergan, and V.H. Kaupp), Tenth Annual Pittsburgh Conference
on Modeling and Simulation, The University of Pittsburgh, Pittsburgh,
Pennsylvania, April, 1979.

"Radar Image Simulation: A Project to Develop a Model, Define its
Operational Constraints, Validate its Accuracy and to Produce
Sample Results," (V.H. Kaupp), Doctor of Engineering Thesis,
University of Kansas, May, 1979.

"A Digital Computation Technique for Radar Scene Simulation: New SLAR,"
(J.C. Holtzman, V.S. Frost, J.L. Abbott, E.E. Komp, V.H. Kaupp),
Simulation, June, 1979.

"Radar Image Preprocessing," (J.C. Holtzman, V.S. Frost, J.A. Stiles,
D.N. Held), Sixth Purdue Symposium on Machine Processing of Remote
Sensing Data, Purdue University, West Lafayette, Indiana, June 2-6,
1980.

"Digital Preprocessing of SEASAT Imagery," (J.C. Holtzman, J.A. Stiles,
V.S. Frost, D.N. Held), International Conference on Communications,
Seattle, Washington, June 8-11, 1980.


SCIENTIFIC PERSONNEL


J.C. Holtzman    Principal Investigator
V.H. Kaupp       Senior Project Engineer    Doctor of Engineering (Spring 1979)
V.S. Frost       Research Engineer          Master of Science